
Black Box Black Box Testing

One of my friends is in college, and is currently feeling the full idiocy of a system that was only beginning to be rolled out as I left. Let me explain how it works.

Essentially, the system is meant to test the students’ solutions to homework problems. This is done by providing a solid definition of what the input and output of the application are supposed to be on the standard in/out channels, and setting up a whole bunch of test cases, including a memory limit and a CPU time limit. Students submit their source code to the system, which compiles it and runs all the test cases against the application in a black box test. So far, so good.

Seeing these guys at work, compared to me and my colleagues at work, makes a few things very apparent: even with a fairly solid grasp of algorithms and data structures, their number one problem is code. Where professional programmers swim through code like sharks in the sea, the students appear to be more or less drowning. Theoretical learning aside, the education lacks practical programming, debugging, practical programming and some more practical programming.

It would seem that these programming exercises would be the perfect opportunity to get that kind of experience, if it weren’t for the fact that the test system is itself a black box. You put in your code, and it tells you yes or no. It’s not quite a boolean pass/fail answer, but close enough: you get exactly one result from the set Didn’t compile, Passed, Failed, Crashed, Time Limit Exceeded. When I first heard of the system, it was justified with the argument that sometimes in professional programming, that’s all you get.

I agree. Sometimes, you get gnarly bugs that give you less information than a world pro’s poker face. I’ve spent weeks tracking bugs like that, using all kinds of tools at my disposal to try to wring more information out of the error, until finally the knot was untied. But for all the bugs like that I’ve been through, none of them were eventually solved by guessing what was wrong and how to fix it.

Supposedly, the tool is meant to teach the students to debug their code… which it somehow does by disallowing all normal debugging tools. You can’t run a debugger on it, you can’t print traces, you’re not allowed to log to a file or socket, you’re not even allowed to know what input caused the error. The only tools you have at your disposal are your wit in coming up with your own test cases and code reviews.

Any attempt at normal debugging would be classified as cheating. If I were faced with a bug under those circumstances, I would do whatever I could to get more information out of it. Hey, I can crash it with different signals: that’s a few bits of information I could get back from it. All those kinds of tricks of the trade that real programmers use to, you know, solve problems… would be cheating.

This leads to a skewing of results… very simple bugs turn into monster problems, since you can’t identify and fix them. What they are learning is not how to debug their programs but how to painstakingly solve the very specific problem of pleasing the system. By artificially making easy things hard, the system has effectively found a way to avoid teaching the students essential skills in programming: simple debugging tools like tracing and breaking into a debugger. Instead, they learn programming by coincidence: poke something until you (hopefully, eventually) get a green light.

That’s not a lesson to learn.

Faced with the obstacle of this system, the only way forward is to learn a different skill: testing. More on that later.
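As a small taste of what that can look like, here is a minimal sketch of testing a solution locally through the same kind of interface the grading system uses: text in, text out. The problem (summing integers) and the names solve(), runCase() and runLocalTests() are made up for this example; I don't know the actual assignments or test cases.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical homework problem: read integers until end of input, print their sum.
// Taking streams as parameters lets the same code run against std::cin/std::cout
// in the submission and against in-memory strings in local tests.
void solve(std::istream& in, std::ostream& out) {
    long long sum = 0;
    long long value = 0;
    while (in >> value) {
        sum += value;
    }
    out << sum << '\n';
}

// Feeds `input` to solve() and returns everything it printed.
std::string runCase(const std::string& input) {
    std::istringstream in(input);
    std::ostringstream out;
    solve(in, out);
    return out.str();
}

// Home-made test cases, including edge cases a hidden test set might probe.
void runLocalTests() {
    assert(runCase("1 2 3\n") == "6\n");
    assert(runCase("") == "0\n");          // empty input
    assert(runCase("-5 5\n") == "0\n");    // negative numbers
    assert(runCase("2000000000 2000000000\n") == "4000000000\n");  // too big for 32 bits
}

// In the real submission, main() would simply call solve(std::cin, std::cout).
```

Since the hidden test cases can't be seen, the value is in the cases you come up with yourself: each one is a guess at what the black box might be feeding your program.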


Don’t Be an Open Source Douchebag

I love open source software. It provides both a neat training ground for programmers and a good place to go scratch that itch. On the other side of things, it provides awesome software for people, including some software that would never come out of a big development house.

Still, there are some issues with free software that don’t really show up to the same degree with commercial software. One such thing is documentation. It’s painfully obvious that documentation is written by people who:

  1. Already know the software in and out.
  2. Don’t like writing documentation.
  3. Know nothing about how people learn.

For instance, when I started a side project a few months back, I was looking for a build system. After settling on CMake, I set about trying to make sense of it. There’s the ever-present getting started example, of course. And then there’s the full reference of everything you could possibly want (almost).

But in between those, there’s nothing. Well, nothing except a book, which just goes to show you that there’s something missing — a professional writer could obviously make some money out of explaining things in a reasonable way.

The problem with this is that it doesn’t match how people learn. Getting started is a good step, but a relatively small one. Most of the time will be spent incrementally expanding that knowledge, moving from beginner to expert. Most time will thus be spent in some kind of zone in between the “getting started” and “reference of everything” levels.

Worse than that, some open source programmers have a tendency to view their full reference documentation as an appropriate resource for everyone. “It’s all in there,” right? But pointing a beginner at a 40-page document detailing all the options of some application when all they want is to run it properly isn’t very helpful. I’m sure you know what I’m talking about if you’ve ever used an open source command line tool.

That brings us to the really dark side of free software culture. The true douchebags out there will not only be extremely smartass in their RTFM comments, they’ll also be incredibly sensitive and defensive about the software they’re working on.

I ran into a problem with Cygwin’s SSHD implementation last week. In searching for the solution, I found this mailing list answer:

  Wrong.  That is uninformed speculation and guesswork.  Stop
spreading misinformation.

  Cygwin SSHD has had the support for fully logging in as any
user since 1.7, as you have already been told and completely
ignored.  Go and read the manual.  The link was in the previous
email I sent in this thread.

  freesshd works exactly as Cygwin *used* to before it got
subauth support: when you log in with a key, rather than a
password, you just end up as an admin user.

Wow. This kind of answer is wrong on so many levels. First of all, while he makes it seem like the functionality has been there forever, Cygwin 1.7 is still not even out of beta. The chance that an end user has it is about zero. So, with the current version (1.5), Cygwin sshd supposedly works just like freesshd. This is clearly false, because the original poster reports one working and the other not (which is, by the way, exactly the same result that I had).

So, a user reporting a problem about logging in gets pointed to a long document about security settings in a beta version, doesn’t understand a word of that document (no surprise there), and as a result gets told to “stop spreading misinformation”. Truth is, simply installed like any normal user installs applications, one works and the other doesn’t, something made quite clear by an answer from the original poster in a different place in the thread:

> Are you talking about password or public key authentication?
> If the latter, Have you tried the LSA authentication package
> in Cygwin 1.7?
I don't know. I'll try to deciper that. Sounds complicated. In
the meantime, friend is using freesshd.

The essence of what he’s saying (which has been completely missed by the cygwin developers) is that the effort required to get cygwin to work like one would reasonably expect of it is much higher than the effort required to just google for something that just works out of the box. The fact that you could potentially make it work is irrelevant, because he’s not getting any help actually making it work.

He might as well just have said, “I don’t care about making it work for you. It works for me.”

Software companies usually compensate for their complete lack of useful technical support with a good (or at least reasonably decent) amount of help documentation. Free software usually has neither.

I encourage any programmer to practice their technical skills on an open source project. But while you do so, take the opportunity to practice your people skills a bit as well, or why not your writing skills? Don’t be an open source douchebag — someone reporting your software’s flaws is not attacking you personally.

What’s a Good Final Year Project?

Here’s a question from the mailbag, coming from a student doing games programming at a university, gearing up for his final-year project:

The degree I’m doing currently has been very much centered around graphical programming, aswell as using various programming languages to bolster our CV’s I would imagine. We’ve not really touched on networking, or AI at all. Our main language in graphical programming has been OpenGL too, we’ve dabbled with DirectX a little, mostly managed directX with XNA.

As a programmer already in the industry, could you give me any advice on the type, and level of project I should aim for, for it to be a suitable demo project to show to prospective employers? It’s difficult enough as students to find our way into the games industry; if we don’t know what employers want, or what is considered worthy in the eyes of the employer, it lowers our odds tenfold.

In a way, the answer to the question depends on what kind of position you’re trying for. I’ll try to answer with regards to as many positions as I know.

For starters, is graphics programming what you want to be doing? If it is, things like what graphics APIs you know become a lot more important. From what I know, most games studios that do Windows game development have gravitated towards DirectX. With that in mind, having done a large(ish) project on DirectX could definitely be a plus. If all other things are equal, I’d definitely go for learning DirectX.

Not having experience with networking or AI isn’t all that much of an issue, unless you imagine going specifically for a networking or AI programming position (and even then, it’s more of a bonus than a requirement). One thing is clear however… if you’re looking to get into professional development at a large studio, you should be going for C++.

Now if we look past the technical requirements, there are a few things that can be said about such a project in itself. What you get from a good project is a nice entry into your demo reel — something to show off. This has a few implications, but most importantly something I’ve touched on before, in A Spectacular Failure: your project is going to be judged on emotional first impressions, not how technologically advanced it is or how nicely coded it is.

Your number one enemy is over-scoping the project, ending up with something that does lots of things, but does none of them in a great way. Come up with a good core gameplay for the game, and then polish it to a great shine. Fix all those annoying glitches and bugs, make sure everything looks as impressive as possible. It doesn’t need to be rocket science, as long as it’s well executed. In the end, what a games studio is looking for is a programmer who knows how to finish projects in a good way.

That doesn’t mean your project should be Space Invaders, but in general trying for something too big is more of a problem than overdoing something too small.

Finally, as an entry on your demo reel, make sure you make the game available in an easy manner. Have a page with plenty of screen shots, videos and preferably the game itself easily downloadable. Your coding ability will definitely be tested with some form of work sample as you apply to studios, so the code itself being available is less important. Reading code is hard, so it’s unlikely that someone will have time to read yours. However, having a finished game to show off is worth a lot, as is the experience of going through all the phases in finishing a game.


Design Fundamentals: Abstraction

Design Fundamentals is my series on code design aimed at people who may have done some code in school, but haven’t done much code design, or who haven’t read much about design before. This is the second article of the series.

Code design is all about making sense of a system. Any software worth mentioning quickly grows larger than your brain can easily track. A good design will ensure that each piece of the program can be safely worked on in isolation — that whatever you need to do, it’ll fit within your working memory.

Abstraction is key to keeping the number of details you need to remember at once to a minimum. By abstracting a piece of code, you remove the details and only have to think about the abstract interface to the code, which should be smaller (or the abstraction was a bad idea).

So exactly how does abstraction work? I’m sure you’ve done some version of it before, maybe even without being aware of it. At a low level of abstraction, the code is filled with all the details. At a high level of abstraction, code uses concepts that hide details. So in abstracting code, we lift out the details to some other place — maybe a class or a function.
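To make the two levels concrete, here is a small sketch. The shipping-cost scenario, the fee values and every name in it are invented for the illustration.

```cpp
#include <cassert>
#include <vector>

struct Item {
    double weightKg;
    bool fragile;
};

// Low level of abstraction: every detail is spelled out in one place.
double shippingCostDetailed(const std::vector<Item>& items) {
    double cost = 0.0;
    for (const Item& item : items) {
        double itemCost = 2.0 + 0.5 * item.weightKg;
        if (item.fragile) {
            itemCost += 3.0;
        }
        cost += itemCost;
    }
    return cost;
}

// The per-item details, lifted out into a function of their own.
double itemShippingCost(const Item& item) {
    const double baseFee = 2.0;
    const double feePerKg = 0.5;
    const double fragileSurcharge = 3.0;
    return baseFee + feePerKg * item.weightKg + (item.fragile ? fragileSurcharge : 0.0);
}

// High level of abstraction: the loop only talks about the concept
// "cost of shipping an item", not about how that cost is made up.
double shippingCost(const std::vector<Item>& items) {
    double cost = 0.0;
    for (const Item& item : items) {
        cost += itemShippingCost(item);
    }
    return cost;
}

// Lifting the details out should not change what the code computes.
void checkBothVersionsAgree() {
    const std::vector<Item> items = {{2.0, false}, {4.0, true}};
    assert(shippingCost(items) == shippingCostDetailed(items));
}
```

When reading shippingCost(), the fee rules are one function name rather than five lines, which is the whole point: fewer details to hold in your head at once.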

When to abstract

While you can debate any kind of rules for when to abstract (just like most things in code), I’ll attempt to give you some guidelines. Adapt them as they suit you and change them if they don’t make sense in your context.

For functions, what you can keep in your head is strongly related to what you can see on your screen. As soon as a function has you scrolling up and down, you’ve got a good sign that the function is too long and that you should abstract something away. For me, the maximum length of a function that is normally reasonable is about 50 lines — the exact number will vary with personal preference and the content of the function.

For both functions and classes, the relationship with other classes or functions also matters, specifically the fan-out (the number of other classes your class uses, for instance). You want to keep the fan-out low, since otherwise you risk forgetting the details of something while working on the code (or confusing the next coder who works with it). Steve McConnell has this to say about it in Code Complete (a recommended book if you’re interested in code and design):

Low-to-medium fan-out: Low-to-medium fan-out means having a given class use a low-to-medium number of other classes. High fan-out (more than about seven) indicates that a class uses a large number of other classes and may therefore be overly complex. Researchers have found that the principle of low fan-out is beneficial whether you’re considering the number of routines called from within a routine or from within a class.

The same thing applies with the number seven as with the above line count — use your judgement, but keep in mind when the fan-out is starting to increase that you might want to look at your design again.

Common abstraction mistakes

The number one mistake people make when it comes to abstraction is to confuse the need to abstract with the interface of classes that is exposed to others. Often, when abstracting a part of a function to a new function, it doesn’t need to go into the interface of the class you’re working with at all. It can simply be placed in an anonymous namespace at the top of the file you’re working on.
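Here is a rough sketch of what that looks like in a source file. The Inventory class and the normalized() helper are hypothetical, made up only to show where the helper ends up.

```cpp
// inventory.cpp (hypothetical file, for illustration only)
#include <cassert>
#include <string>
#include <vector>

// What would normally live in inventory.h. The public interface is not
// touched when the helper below is abstracted out.
class Inventory {
public:
    void add(const std::string& name);
    bool contains(const std::string& name) const;
private:
    std::vector<std::string> names_;
};

namespace {

// Helper abstracted out of the member functions: strips spaces and lowercases
// ASCII letters. It is visible only inside this source file, so it adds
// nothing to the class interface.
std::string normalized(const std::string& name) {
    std::string result;
    for (char c : name) {
        if (c != ' ') {
            result += (c >= 'A' && c <= 'Z') ? static_cast<char>(c - 'A' + 'a') : c;
        }
    }
    return result;
}

}  // namespace

void Inventory::add(const std::string& name) {
    names_.push_back(normalized(name));
}

bool Inventory::contains(const std::string& name) const {
    const std::string key = normalized(name);
    for (const std::string& stored : names_) {
        if (stored == key) {
            return true;
        }
    }
    return false;
}

void inventoryExample() {
    Inventory inventory;
    inventory.add("Health Potion");
    assert(inventory.contains("health potion"));
    assert(!inventory.contains("mana potion"));
}
```

Nobody including the header ever learns that normalized() exists, so it can be renamed, changed or removed without anyone else having to care.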

Another mistake that is fairly common is to treat abstraction as a simple cut and paste activity, where a number of lines of code are simply moved to another place. Stop to consider where to make the cut: what part of the code you’re working on is a logical unit that does something reasonable? If you end up with a function named partOfMyOtherFunction() or something similar, you haven’t really achieved anything.

Keep thinking about your abstracted functions — do they share characteristics? Often you might find yourself creating helper functions that would fit better grouped together in a helper class. This class can also reside in the same file — there’s nothing which forces you to create a new header file and source file in order to abstract code.

Building abstract code

If possible, build your code in abstract layers from the start. Consider what parts you might need when implementing your functionality, and make sure implementation details are well partitioned off. Things like system differences and interfaces towards the operating system are especially good targets for an abstraction layer.

Encapsulate the system functionality in order to not have to bother with details everywhere in the code. This is especially useful when interfacing with badly designed APIs, to ensure that this doesn’t spread to all of your code. This lets the rest of your code think about the system functionality at a higher level, without the details.
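As a minimal sketch of such a layer, here is a wrapper around the C library's std::getenv(), which hands back a raw pointer that is null when the variable isn't set. The function names, the DEMO_DATA_DIR variable and the fallback path are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Thin layer over std::getenv(). Callers of this function never see the raw
// pointer or the null check; they get a string either way.
std::string environmentVariable(const std::string& name, const std::string& fallback) {
    const char* value = std::getenv(name.c_str());
    return value != nullptr ? std::string(value) : fallback;
}

// Higher-level code talks about "the configured data directory" only.
// DEMO_DATA_DIR and "./data" are hypothetical.
std::string dataDirectory() {
    return environmentVariable("DEMO_DATA_DIR", "./data");
}

void environmentExample() {
    // An unset variable yields the fallback instead of a null pointer.
    assert(environmentVariable("DEMO_SURELY_UNSET_VARIABLE", "none") == "none");
}
```

If the setting later moves to a config file or a platform-specific API, only environmentVariable() or dataDirectory() has to change; the rest of the code keeps asking the same question.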

Design Fundamentals: Encapsulation

Encapsulation is one of the core design patterns of Object Oriented programming. The point of encapsulation is to split a large program into a number of smaller, independent parts to reduce complexity. In a sentence, encapsulation is hiding the implementation details of a module from its user.

The point of doing this is that it’s easier to use a module with a well defined interface and it’s easier to change the implementation if fewer things depend on it. If you expose the implementation of a module to its user, you can bet the user is going to end up in some way or other dependent on the implementation details. This means that the risk of breaking something increases every time you make a change to the module.

Why does this not become an apparent problem for a new programmer? There are two major contributing factors. One is that new programmers tend to write small to medium sized programs, and that dependency-based problems tend to become apparent only in large applications. In a small application, you’ll have a few users of your module, and when you break the implementation dependency, maybe one of them breaks. You fix the error and move on. Simply put: You don’t need design skills to write “Hello, World!” applications.

Experienced Software Engineers tend to write large to massive program systems. If you have a lot of things dependent on the implementation details you’re changing, chances are a lot of things will break, and maybe some of the errors won’t be apparent until much later. Even better, if all your components are dependent on the implementation details of others, every change you make is virtually guaranteed to break something.

The second reason novice coders fail to notice the need for encapsulation is that the errors that spring from this are delayed. There’s nothing immediately wrong with your code; it works. That’s why this is a question of design, not of code.

So how do you properly encapsulate code? Start by looking at your interface for the class. How does one interact with it? Consider what it means — does the interface tell you what it’s doing, or how it’s doing it? If you find it’s telling you anything at all about how it does things, consider what you could do to hide it.

Another useful way to think about a class interface in terms of encapsulation is to ask “how could I break this object’s functionality?” If you find you could break it, something’s wrong with your encapsulation. A properly encapsulated object guarantees the consistency of its own state at all times. This way of thinking about a module leads to robust interfaces and code.
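A small sketch of an object that can't be broken from the outside might look like this. The Health class and its rules are invented for the example; the invariant it protects is that the current value always stays between zero and the maximum.

```cpp
#include <cassert>

// Invariant: 0 <= current() <= maximum() at all times, whatever callers do.
class Health {
public:
    explicit Health(int maximum)
        : maximum_(maximum > 0 ? maximum : 1), current_(maximum_) {}

    void damage(int amount) {
        if (amount <= 0) return;  // nonsensical input is ignored
        current_ = (amount >= current_) ? 0 : current_ - amount;
    }

    void heal(int amount) {
        if (amount <= 0) return;  // nonsensical input is ignored
        current_ = (amount >= maximum_ - current_) ? maximum_ : current_ + amount;
    }

    int current() const { return current_; }
    int maximum() const { return maximum_; }
    bool isAlive() const { return current_ > 0; }

private:
    int maximum_;  // declared before current_, which is initialized from it
    int current_;
};

void healthExample() {
    Health health(100);
    health.damage(250);   // more than the current value: clamps at 0
    assert(health.current() == 0);
    health.heal(1000);    // more than what is missing: clamps at the maximum
    assert(health.current() == 100);
    health.damage(-50);   // negative damage does nothing
    assert(health.current() == 100);
}
```

Had current_ been a public variable, any line of code anywhere could set it to -40, and every user of the class would have to defend against that.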

A good starting point for encapsulation is to make all your internal data private, and to expose only retrieval methods for those things the outside world has any business knowing about (languages with Properties like C# avoid this, but without them you’re better off doing this). Remember, one point of encapsulation is that you should be able to change the implementation without changing the interface, and if your internal variables are public, you can’t change how you store data.
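Here is a minimal sketch of that idea, using a made-up Rect class. Callers ask for width and height, and have no way of telling how the rectangle is actually stored.

```cpp
#include <cassert>

class Rect {
public:
    Rect(int left, int top, int width, int height)
        : left_(left), top_(top), right_(left + width), bottom_(top + height) {}

    // Retrieval methods only: the outside world reads, but never pokes at the data.
    int left() const { return left_; }
    int top() const { return top_; }
    int width() const { return right_ - left_; }
    int height() const { return bottom_ - top_; }

private:
    // Stored as two corners here. Storing width and height directly would work
    // just as well, and no caller would notice the switch.
    int left_;
    int top_;
    int right_;
    int bottom_;
};

void rectExample() {
    const Rect rect(10, 20, 30, 40);
    assert(rect.width() == 30);
    assert(rect.height() == 40);
}
```

With public member variables, every piece of code that read rect.width would break the day the storage changed to corners.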

There’s a tendency to expose other complex objects that your class owns by direct accessor functions as well. This is usually a mistake. In essence, you’re giving up control over these objects, which means you can no longer guarantee your internal state is consistent and you can’t switch to a different kind of object.

Consider an example of a logging facility that has an output stream. There might be a temptation to do something like this:

log.getOutStream() << "Testing, testing";

This is where you break encapsulation, because suddenly you have no idea what’s being done to the log stream. Maybe someone saved a reference to it somewhere? Is it safe to delete it? Did someone start a row and leave it unfinished (like I did above)? Dunno.

Also, you’re now stuck unable to change to logging over a network, onto a printer, into a GUI or whatever else could be useful to do (unless you want to derive your own iostream, which I wouldn’t recommend).

An option is to encapsulate the logging service fully, and only provide an interface for what callers actually need to do (log text):

log.addTextRow("Testing, testing");

This way, whatever you do in addTextRow() is your own business, as long as logs get made.
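A bare-bones sketch of such a class could look like the following. Only addTextRow() comes from the example above; the rowCount() method, the in-memory storage and the choice of std::clog as the destination are assumptions made for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

class Log {
public:
    // The only way to write to the log: hand over one complete row of text.
    // Nobody can keep a reference to a stream or leave a row half finished.
    void addTextRow(const std::string& text) {
        rows_.push_back(text);
        std::clog << text << '\n';
    }

    std::size_t rowCount() const { return rows_.size(); }

private:
    // Where the rows go is private. Writing to a file, a network connection
    // or a GUI widget instead would not change a single caller.
    std::vector<std::string> rows_;
};

void logExample() {
    Log log;
    log.addTextRow("Testing, testing");
    assert(log.rowCount() == 1);
}
```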

As with most design issues, getting encapsulation right is hard. You’ll almost never get it right the first time around. That doesn’t mean you shouldn’t try your best though: the reason you get it right the second time is that you notice what happened the first time you tried.

Many new coders tend to mix up project process with design process. Hearing about waterfall (and its negative sides), they say “I’m not doing waterfall”, which to them means not doing any design up front. This is a big mistake: think about your design all the time. When you sit down to write a new class, you should have a design idea in mind. Trying to nail a design onto a kludge of code is a lot harder than reworking a bad design into a good one.

What does this have to do with encapsulation? It means you should make your variables private from the start, and immediately start thinking about your interface. That way, you’re not going to have to realize halfway through that you’ve missed encapsulating things. Nearly always, the key to getting the design right is to start thinking about the design.

When I recently sat down to implement a new feature of a side project I’m working on, the first thing I did was create 10 empty files, because I’d already thought the design through enough to know I’d need those classes. That doesn’t mean you should do all your design up front, and never touch it again (waterfall-style), it just means there’s nothing wrong with thinking about design before you start coding. In fact, I very much encourage it.
