Thursday, January 8, 2009

Human factor

But wait, perhaps we can optimize it further!

At work today I saw this little beauty at the end of a method with void return
type:

if(true)
return;

I thought it was quite wise of him to check that. Incidentally, the author of
this little koan now works at Google.

The Ideas in the Spring Framework

The proponents of new technologies always seem to see themselves as involved in
some kind of populist rebellion. Judging from the rhetoric, I am pretty sure
that Ruby programmers, for example, see themselves as something like
revolutionaries storming the barricades.

I think the underlying reason these little wars go on is that so few people are
in a position to understand the trade-offs involved in a technology choice,
because understanding the trade-off requires understanding both alternatives.
That takes a great deal of experience, and the more fundamental the change, the
longer it takes. It is easy for me to like, for example, Haskell, since I have
never had to build a large, reliable, non-trivial piece of software in it; I
only get to see the benefits.

In any case Java had its own little rebellion not so very long ago, producing a
slew of "light-weight" technologies (Java's conception of what is light-weight
is very different from anyone else's, so we had better keep the term in quotes).
This business of periodically turning against the tools we use is a bit
counter-productive and kind of juvenile, but I am willing to accept a bi-yearly
cycle of ecstasy and loathing to work in an industry with a little passion.
Probably foremost in this mini-rebellion was the Spring Framework, which bills
itself as a sort of add-on to J2EE. The core idea in Spring is "dependency
injection," which is billed as a better way of building applications out of
loosely-coupled components.

It is very difficult to evaluate the effect of an idea like this, one which
claims to improve software architecture, because small programs have very little
need for anything like architecture. As a consequence it is very difficult to
explain the advantage of the new idea, because the advantage only becomes
significant when you have a program of a certain size. So evaluating this kind
of idea requires implementing a large system multiple times in different ways to
see which is better. It should be no surprise that this takes years (the best
example is the microkernel debate, where, many complete production-quality OSes
later, we still have both).

In any case, Spring remains quite popular and just released version 2.0. I have
been using Spring for a while and I think I am ready to start reflecting on how
good an idea the whole dependency-injection-as-core-architecture choice really
is. For the record I think Spring is about as good as you can do in Java-land
right now. But there is a whole world of difference between being better than
the others and being good.

For those who haven't stumbled across it already, the idea of dependency
injection is to make component dependencies parameters of your objects rather
than hard-coding them. Thus if your program does logging (for example) using
some home-grown system, you might do the following:


class Example {
    private Logger logger = new FileSystemLogger();

    public void doSomething() {
        logger.info("Doing something...");
        // ...
    }
}


The good news is that your class mostly depends only on the Logger interface
(say, info, error, warn, etc.); the bad news is that you have hard-coded an
implementation of Logger, FileSystemLogger, when you initialize the logger.
There are a lot of ways to abstract away this dependency, but the absolute
simplest is the following:


class Example {
    private Logger logger;

    public Example(Logger logger) {
        this.logger = logger;
    }

    public void doSomething() {
        logger.info("Doing something...");
        // ...
    }
}


In this way whoever instantiates the Example class has to provide the Logger
implementation, and we have rid ourselves of the dependency.

This example hardly does the idea justice; you have to imagine a large
collection of services bound together in this way before you can begin to
imagine the benefits.

But in Spring the dependency injection idea is commingled with a few other ideas:

XML configuration

Aspect-oriented programming

Fixing J2EE

Programming to Interfaces

Fixing J2EE is where Spring is most clearly successful. If you are using
straight-up JDBC you can switch to Spring's JdbcTemplate, write a fifth of the
JDBC code immediately, and get free transaction handling too (which now comes
with a nifty little @Transactional annotation).
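
To make the savings concrete, here is a minimal sketch of a JdbcTemplate-based
data-access class (written against a reasonably modern Spring JDBC; the table
and column names are invented for illustration):

import java.util.List;
import javax.sql.DataSource;
import org.springframework.jdbc.core.JdbcTemplate;

class UserDao {
    private final JdbcTemplate jdbcTemplate;

    UserDao(DataSource dataSource) {
        this.jdbcTemplate = new JdbcTemplate(dataSource);
    }

    List<String> findUserNames() {
        // JdbcTemplate acquires the connection, creates the statement, walks
        // the ResultSet, translates exceptions, and closes everything even on
        // error. With raw JDBC this one line is a page of try/finally plumbing.
        return jdbcTemplate.queryForList("SELECT name FROM users", String.class);
    }
}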

The XML configuration bit is more debatable. In Spring one would implement the
above example with bean-definition XML along these lines [1]:

<beans>
    <bean id="logger" class="FileSystemLogger"/>
    <bean id="example" class="Example">
        <constructor-arg ref="logger"/>
    </bean>
</beans>

Actually the XML itself is a terrible way of wiring things together when
compared with Java. XML won't enforce type-correctness, and it requires the
programmer to learn a whole new set of Spring-specific semantics. Rather than
having an applicationContext.xml file full of Spring configuration XML, it is
vastly simpler to have the following:

class ApplicationContext {
    ApplicationContext() {
        Logger logger = new FileSystemLogger();
        Example myExample = new Example(logger);
        // ...
    }
}

I thought I had invented this idea when I first thought of it, but a little
research shows that it turns up at the end of Martin Fowler's essay on Inversion
of Control, and a very intelligent discussion of the details of Java-based
dependency injection is given in this article. The primary advantage of using
code is that the flow of control in the application remains fully documented and
understandable. It is well known that the Spring configuration for a large
project can become a problem in and of itself. Spring has mechanisms to help
avoid configuration-file madness, but frankly they cannot compete with the
mechanisms in Java itself (e.g. type-checking, inheritance, etc.).

The claimed advantage of a configuration file over source code is that one can
change the configuration without having to recompile. It is true that many
organizations treat configuration changes less seriously than code changes, but
that is because those organizations only support primitive configuration (e.g.
string or number values). Spring allows you to completely rearrange the flow of
control via configuration (disabling all transactionality, say, or in general
stringing together completely untested software combinations). It is true that
simple properties are often maintained via dependency injection, but these are
typically externalized to a properties file just as they would be without
dependency injection. I doubt most organizations that use Spring seriously allow
application-context XML changes without at least cursory QA verification, in
which case a new build is not really much of an issue.

So I can't say much for the XML part of Spring. I still think programs are
better off as code.

The next idea in the Spring medley is programming to interfaces. The phrase is
much older than Java itself, but when Java programmers use it, it is not
entirely clear what they mean. That is, there is programming to interfaces, and
then there is programming to Java interfaces. Programming to interfaces means
creating a well-defined set of publicly accessible operations for your object
and using only those operations. It is an excellent idea and you can practice it
in virtually any programming language (for example, the Linux kernel does a lot
of programming to interfaces, e.g. the filesystem interface). But the Spring
people seem quite convinced that you should program to Java interfaces, that is,
that you should create a Java interface for, well, everything. The only benefit
of this practice is that it will make your project look very complicated, which
may impress your coworkers. If you are being paid on a per-line-of-code basis
you will incur a nice little 10% bonus. The downside is that, oh yes, there will
be more code. And every time you add a parameter to some method in class A you
will also need to add it to interface A. In other words this advice boils down
to "repeat yourself unnecessarily." This kind of thing is particularly irksome
in the early stages of development when interfaces are evolving. I will go ahead
and claim that, in application development, there is not one single advantage of
programming to Java interfaces over the following:

You need, oh I don't know, say a logging class, and you suspect that in the
future there may be many logging implementations.

You create a class Logger.java which contains the implementation you initially
plan to use. You put thought into the operations possible on a Logger, even
considering the future implementations you may need. You do not create any
interfaces.

You check in your code and release this version of your project.

95% of the time it ends here, but 5% of the time logging becomes much more
important and you need to support more ways of logging.

You use the refactoring functionality in your IDE to separate Logger.java into
an interface Logger.java and an implementation FileSystemLogger.java (a sketch
of this follows below). This takes all of 3 seconds. Actually this is a nice
little advantage of the Java convention of naming classes with
implementation-specific details (e.g. ArrayList) and naming interfaces with a
simple generic name (e.g. List).

Notice how 95% of the time creating the extra interface up front is just a waste
of time, and the 5% of the time it isn't, you lose only those 3 seconds required
to fire up the IDE.
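
Here is roughly what that 3-second extract-interface refactoring produces, using
the Logger example from above (two snapshots of the same file, not one
compilation unit; the method set is illustrative):

// Before: one concrete class, no interface.
class Logger {
    public void info(String msg) { /* append to a log file */ }
}

// After the IDE's extract-interface refactoring:
interface Logger {
    void info(String msg);
}

class FileSystemLogger implements Logger {
    public void info(String msg) { /* append to a log file */ }
}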

Now this doesn't always hold true. It may be that you need to support two
implementations right from the get-go, in which case a Java interface is exactly
what the doctor ordered. It also might be that you are writing a library, not an
application, which is totally different. A library is full of code that is meant
to be used without changing the code itself, and so it has to support any kind
of reasonable extension. In this case up-front interfaces are a requirement. But
just because you are often annoyed that, say, the Java standard library didn't
do this doesn't mean you should go littering libraries throughout your own
application. I think one of the reasons for the complexity of many Java projects
is that the programmers have taken too much advice from library developers (who
are usually very good programmers and worth listening to) and have started to
act as if they did not have access to their own code. This is why your project
at work has 4 trillion lines of configuration, any serious change to which will
end in breakages that you can only detect at runtime.

Part of this comes from the fixation many programmers have with object-oriented
programming. To put it simply, let me rank three things from best to worst:

Simple requirements

Complex requirements satisfied with two separate polymorphic implementations
(e.g. an interface and two implementations)

Complex requirements satisfied with a morass of if statements and case-wise
logic.

When you have multiple implementations in a system, you have a potentially
combinatorial explosion of system states. That is, if you have 10 interfaces
with 3 implementations each, then you have 3^10 = 59,049 combinations to test.
It doesn't matter whether you use the complex set of case-wise logic or the
clean programming-to-interfaces style; the complexity is still potentially
exponential.

Fortunately Spring only recommends that you make all these interfaces; you don't
actually have to. I suspect the recommendation dates to an early technical
limitation in Spring, before cglib made it possible to proxy concrete classes.
As usual, when defending a technical limitation people pretend it is not a
limitation at all…

Finally we come to Aspect-Oriented Programming. First of all, I think the Spring
people are right that fine-grained aspect orientation (e.g. in AspectJ) is not
worth the complication; all that is really needed is the ability to wrap methods
with reusable services. But how hard is it, really, to proxy a method? Before we
go any further, let's look at how easy this is to do if one has first-class
functions. Here is an implementation in pseudo-JavaScript:

function makeBeforeAdvisedFunction(fun, advice) {
    return function() {
        advice.apply(this, arguments);     // run the advice first
        return fun.apply(this, arguments); // then call the original function
    };
}

See how it works? You give a function, and you get back a function which works
exactly the same, but with advice applied.
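
The same trick in Java takes a little more ceremony, but the JDK's built-in
dynamic proxies are all you need; no Spring required. A minimal sketch, reusing
the Logger interface from earlier (the helper name is mine):

import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

class BeforeAdvice {
    // Wraps a target behind one of its interfaces so that advice runs
    // before every method call.
    @SuppressWarnings("unchecked")
    static <T> T advise(final T target, Class<T> iface, final Runnable advice) {
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(),
                new Class<?>[] { iface },
                new InvocationHandler() {
                    public Object invoke(Object proxy, Method method, Object[] args)
                            throws Throwable {
                        advice.run();                       // the "before" advice
                        return method.invoke(target, args); // delegate to the real object
                    }
                });
    }
}

// Usage:
// Logger logger = BeforeAdvice.advise(new FileSystemLogger(), Logger.class,
//         new Runnable() { public void run() { /* audit, time, etc. */ } });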

Of course this isn't quite as powerful as Spring AOP, which allows you to apply
advice to many methods with a single regular expression. But is that really
necessary? I mean, if applying the advice is all of 10 extra characters, is it
such a big deal to add it explicitly? Furthermore, isn't it beneficial to have
everything right there, explicitly spelled out, rather than having code
invisibly inserted at runtime?

I don't really know the answer to this. I suspect that the phrase Aspect-Oriented
Programming is a complete misnomer, since your project is unlikely to become
aspect-oriented. It may feature aspects for, say, four or five essential
services, but the idea that it would become completely oriented towards aspects
is a complete oversell [2]. The whole point is that there are a few key things
repeated all over. The ones that almost all programs share are transaction
management, caching, audit trails (recording all the changes to an object), and
security checks, plus a few that are probably domain-specific.

A truly aspect-oriented program sounds like a disaster. How would one debug such
a beast if it were composed of bits and pieces of code glued together from all
over, without any clear control flow? But for the specific examples above it is
a compellingly non-invasive way to add reusable services on top of an object.

[1] Actually Spring advocates using setter methods instead of constructor
arguments for setting properties. They argue that this is more flexible. I agree
that it is more flexible, but I think it is more dangerous. First, it forces you
to implement setter methods for properties even though it may be meaningless or
disastrous if anyone calls those setters after initialization (for example, what
happens if someone calls setDataSource on your data-access object in the middle
of a transaction? I don't know, and I can't think of what correct behavior would
even be in this case). Secondly, it makes it very troublesome to ensure the
validity of your object, because one cannot enforce any requirement on the
complete set of parameters. This deficiency is partially overcome by the
@Required annotation available in Spring 2.0, but of course that means every
property must be either required or not; it can't be the case that one either
provides A or provides B.
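
To illustrate the trade-off, a minimal sketch of the two styles, with an
invented DAO class:

import javax.sql.DataSource;

// Setter injection: flexible, but the object can exist half-configured,
// and nothing stops a caller from swapping the DataSource at any time,
// even in the middle of a transaction.
class SetterStyleDao {
    private DataSource dataSource;

    public void setDataSource(DataSource dataSource) {
        this.dataSource = dataSource;
    }
}

// Constructor injection: the complete parameter set is validated once,
// up front, and the field can be final.
class ConstructorStyleDao {
    private final DataSource dataSource;

    public ConstructorStyleDao(DataSource dataSource) {
        if (dataSource == null) {
            throw new IllegalArgumentException("dataSource is required");
        }
        this.dataSource = dataSource;
    }
}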

[2] In addition to its tendency to oversell itself, the whole Aspect-Oriented
thing is terminology-heavy, and the terminology is particularly wretched
(point-cuts, join-points, etc.). It's a pretty off-putting combination of
commercial salesmanship and pseudo-academic phraseology. But you know, we
shouldn't let that kind of thing sink good ideas if they are there.

Special Topics In Calamity Physics

Just finished reading this awesome Prep School/Coming Of Age/Thriller thing.
Don't start it unless you have some time, though. I really did nothing but read
for about three days.

Next up The Emperor's Children; this should round out my trendy bestselling lit.
indulgence.

5 Principles For Programming

Here are a few things I have learned about programming computers, in no
particular order. I didn't invent any of them, and I don't always follow them.
But since nobody seems to know very much about making good software, it makes
sense to try to distill a little wisdom when possible.

Fail Fast

Check for programming errors early and often, and report them in a suitably
dramatic way. Errors get more expensive to fix as the development process
progresses: an error the programmer catches in her own testing is far cheaper
than one the QA tester finds, which is in turn far cheaper than the one your
largest customer calls to complain about. The reason this matters is that the
cost of software comes almost entirely from the errors. To understand why,
consider writing code in the following manner: you are assigned some feature,
you type up a complete implementation all in one go, then you hit compile for
the first time. (I TA'd a beginning programming class in grad school, and this
is not very different from how beginning programmers insist on working.) The
point is that if you have any experience writing software, you know that if
getting to the first compile required n man-hours, then the time required to
reach shippable code is probably between 2n and 100n man-hours, depending on the
domain. That time will be divided between the programmer's own bootstrap
testing, QA time (and the associated bug fixing), and perhaps some kind of beta.

The classic examples of this principle are type-checking, unit testing, and the
assert statement. When I first learned about the assert statement I couldn't
accept that it was useful. After all, the worst thing that can happen is that
your code crashes, right? And that is exactly what the assert statement causes.
For all the hoopla about unit testing, you would think it was something deeper
than just a convention for where to put your assert statements. But software
development is in such an infantile stage that we shouldn't poke fun; unit
testing, for all the child-like glee of its proponents, may well be the software
engineering innovation of the decade.

You see the violation of the fail-fast principle all the time in Java
programming, where beginning programmers will catch (and perhaps log) exceptions
and return some totally made-up value like -1 or null to indicate failure. They
think they are preventing failures (after all, whenever an exception shows up in
the logs it is a problem, right?) when really they are removing simple failures
and inserting subtle, time-sucking bugs.
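
A minimal sketch of the two styles; the method names are invented for
illustration:

// The anti-pattern: swallow the exception and return a made-up value.
// The bogus -1 propagates and blows up far away from the actual cause.
static int parsePortBad(String s) {
    try {
        return Integer.parseInt(s);
    } catch (NumberFormatException e) {
        return -1; // "handled" -- and a subtle, time-sucking bug is born
    }
}

// Failing fast: a bad value blows up loudly, right where it happened.
static int parsePortGood(String s) {
    int port = Integer.parseInt(s); // let the exception propagate
    assert port > 0 && port < 65536 : "port out of range: " + port;
    return port;
}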

Unfortunately the idea of failing fast is counter-intuitive. People hate the
immediate cause of pain, not the underlying cause. Maybe this is why you hear so
many people say they hate dentists, and so few say they hate, I don't know,
plaque. This explains a lot of what irritates people about statically typed
languages: when the compiler complains, we hate the compiler; when the program
does what we said and crashes, we hate ourselves for screwing up, even when it
is an error that could have been discovered by a more vigilant compiler.

This is why I can't work myself into the same first-kiss level of ecstasy others
manage over languages like Ruby [1]. Dynamic code feels great to program in.
After the first day you have half the system built. I did a huge portion of my
thesis work in Python and it was a life-saver. Thesis work doesn't need to be
bug-free; it is the quintessential proof-of-concept (and yet so many CS
students, when faced with a problem, break out the C++). But I have also worked
on a large, multi-programmer, multi-year dynamically typed project, and that was
not so pleasant. A large dynamically typed code base exhibits all the problems
you would expect: interfaces are poorly documented and ever-changing, uncommon
code paths produce errors that would have been caught by type checking, and IDE
support is weak. The saving grace is that one person can do so much more in
Python or Ruby that maybe you can turn your 10-programmer program into three
one-programmer programs and win out big, but this isn't possible in a lot of
domains. It is odd that evangelists for dynamic languages (many of whom have
never worked on a large, dynamically typed project) seem to want to deny that
static type-checking finds errors, rather than just saying that type-checking
isn't worth the trouble when you are writing code trapped between a dynamically
typed database interface and a string-only web interface.

Syntax highlighting (and auto-compilation) in IDEs is another example of this
principle, but on a much shorter timescale. Once you have become accustomed to
having your errors revealed instantaneously, it is painful to switch back to
waiting for a compiler to print them out in bulk.

Write Less Code (and Don't Repeat Yourself)

This is perhaps the most important and deep principle in software engineering,
and many lesser principles can be derived from it. Somehow simple statements/programs/explanations/models
are more likely to be correct. No one knows why this is; perhaps it is some deep
fact about the universe, but it seems to be true.

In software this comes into play as bugs: longer programs have a lot more bugs,
so longer programs cost more.

Worse, difficulty seems to scale super-linearly as a function of lines of code.
In the transition from Windows XP to Vista the codebase went from 40 million to
50 million lines of code. To do this took 2,000 of the worldв_Ts best software
engineers 5 years of work.

The reason for this is that the only way to get real decreases in program size
(decreases of more than a few characters or lines) is to exploit symmetry in the
problem you are solving. Inheritance is a way to exploit symmetry by creating
type hierarchies; design patterns are an attempt to exploit symmetry of solution
type. Functional languages are still the kings of symmetry (all of lisp is built
out of a few primitive functions). But rather than categorize these by the
mechanism of the solution, it is better to think of them as what they are: ways
to write less code.

The best way of all to avoid writing code is to use high-quality libraries. The
next time you find yourself writing a web application, reflect on how little of
the code executing is really yours, and how much belongs to the Linux kernel,
Internet Explorer, Windows XP, Oracle, Java, and the vast array of libraries you
rely on ("ahh, the old hibernate/spring/JSF/MySQL solution, so lightweight…").

Some of the difficulties of large programs are technical, but many are
sociopolitical. Microsoft is the size of a medium-sized country by income and
the size of at least a small city by head-count. Yet it is run in much the same
way as any 200-person company, namely some guy tells some other guy what to do,
and he tells you. Unfortunately they have found that none of these things work
at that scale, and I don't think anyone has a really good idea of how to fix
them.

Your problem doesn't require a small city to produce, but the principle is the
same. If your solution requires double the man-power, then all the
organizational overhead to handle this will have to be developed. Furthermore
the organization will be composed of computer programmers, who are often at
roughly the same level of interpersonal sophistication as that Sling Blade guy.

What is remarkable, though, is that making the solution small also means making
it clear. I think this has mostly to do with human brains. We can only think one
or maybe two sentences' worth of thought at a time, so finding the concepts that
make your solution one sentence is essential. The famous haskell quicksort is a
perfect example of this. I can't help but feel jealous of the computer science
students ten years from now who will see algorithms presented in that way. (If
you don't read haskell, the program just says: "An empty list is quicksorted.
The quicksort of a non-empty list is the concatenation of (1) the quicksort of
the list elements less than the first element, (2) the first element itself, and
(3) the quicksort of the list elements greater than or equal to the first
element." Though, of course, the haskell version is much briefer.)
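
For reference, here is the famous two-line haskell quicksort the paragraph is
describing (the standard demonstration version, lovely to read but not an
efficient in-place sort):

quicksort :: Ord a => [a] -> [a]
quicksort []     = []
quicksort (x:xs) = quicksort [a | a <- xs, a < x]
                   ++ [x]
                   ++ quicksort [a | a <- xs, a >= x]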

Computer Programs Are For People

"We want to establish the idea that a computer language is not just a way of
getting a computer to perform operations but rather that it is a novel formal
medium for expressing ideas about methodology. Thus, programs must be written
for people to read, and only incidentally for machines to execute."
The Structure and Interpretation of Computer Programs

The wonderful thing about the above quote is that it gets less brave and more
obvious every year. We know that c/java/lisp/haskell have not one bit of power
that isn't in simple assembly language; they only allow us to express the ideas
more clearly and prevent certain kinds of stupid mistakes. There is no program
that can be written in one that can't be written in another, and all end up as
machine instructions sooner or later (some at compile time, some at run time,
but no matter). Given this fact, it should be obvious that the only reason to
have a programming language is to communicate with a person. Don Knuth wrote
about this idea, calling it Literate Programming, and created a system called
WEB which was a great idea mired in a terrible implementation [2]. The idea was
to embed the program in an essay about the program that explained how it worked.
The Sun programmers simplified this in some ways with Javadoc, but something was
lost, since it is very hard to get any of the big ideas out of Javadoc, or even
to know where to start reading. Projects always have two links: one for Javadoc
and one for higher-level documentation written in some other system. WEB created
a linear narrative to describe a program that might not be quite so
straight-forward; Javadoc creates a flat list of documentation with no
beginning, end, or summary. Neither is quite what I want.

It is a small tragedy that programmers, who spend so much time trying to
understand obscure code, and so much time creating new languages to write more
obscure code in, spend so little time coming up with the WYSIWYG version of WEB
that would make program source input look like the beautiful WEB/TEX output.

I think the best ideas in object-oriented programming also fall under this
category. The methodology of solving a problem by creating a domain model to
express the problem in is the best example. The point of such a model is to
create a level of abstraction that exists wholly for people to think in. Not all
problems are easily broken by this method, but many are. The concept of
encapsulation (you know, the reason you type private in front of all those Java
variables) is another example. Both of these exist to make things simpler, more
usable; to put it simply, more human.

A sort of corollary of writing computer programs for people, writing less code,
and solving the general problem is the following: write short functions. If I
could transmit only a single sentence to the programmers of tomorrow which
summed up everything I knew, it would be that: write short functions [3]. When
you write short functions you are forced to break the code into logical
divisions, and you create a natural vocabulary built out of function names. This
makes the code an easily readable, easily testable set of operations. When you
have done this it becomes possible to see the duplication in your code, and you
start solving the more general problems. It is sad that the best piece of
software engineering advice I know of is to write short functions, but, well,
there it is. Don't spend it all in one place.

Do The Right Thing

This principle sounds funny when stated directly; after all, who is advocating
doing the wrong thing? And what is the right thing to do, anyway?

The point is that in the process of developing software I am always facing the
following situation: I can cut corners now, hack my way around a bug, add
another special case, etc., OR I can try to do the right thing. Often I don't
know what the right thing is, and then I have no choice but to guess. But more
often I know the best solution, but it requires changing things. In my
experience every single factor in the software development process will argue
for doing the wrong thing: schedules, managers, coworkers, and even, when they
get involved, customers. All of these groups want things working as soon as
possible, and they don't care what is done to accomplish that. But no one can
see the trade-off being made except the programmers working on the code. And
each of these hacks seems to come back like the Ghost of Christmas Past in the
form of P0 bugs, and I end up doing the right thing then, under great pressure
and at higher cost than if I had done it before.

A lot of times doing the right thing means solving the more general problem. And
it is an odd experience in computer programming that often solving the more
general problem is no harder than solving the special cases, once you can see
the general problem.

There is a lot of advice that argues the opposite. This line of thought says
just throw something out there, then see how it is conceptually broken, then fix
it. This argument is perfectly summarized in the "worse-is-better" discussion,
and I don't have much to add to it. Except to say this: I think that
worse-is-better is a payment scheme. With worse-is-better you get 85% of a
solution for dirt cheap, and for the remaining 15% you will pay in full every
month for the rest of your software system's life. If you are writing software
that you know will be of no use tomorrow, then worse-is-better is a steal (but
you might want to consider quitting your job). If you are writing software that
will last a while, you should do the right thing.

Reduce State

You may have heard that Amdahl's law is the new Moore's law, and that by the
time Microsoft finishes the next version of Windows, computers will have, like,
holy shit, 80 fucking cores. This means that in five years, when your
single-threaded program is going full tilt boogie, it will be using all of
1/80th of the processor on the machine. As a semi-recent computer science grad
student my opinion of concurrency is "neato." But I notice the old-timers at
work have more of a "we are so totally fucked" look about them when they talk
about it. I think the reason is this: if x is a mutable object, then the
following doesn't hold in a multithreaded program:

x.equals(x)

Like everyone else, I am guessing that the end game for all this will be a
massive reduction in mutable state. Those functional language people are
definitely on to something. But those of us who still have to program in
dysfunctional languages during the day need a more gradual path. The question
is: if mutable state is bad, is less mutable state less bad? Or do I have to get
rid of it all?

I don't know the answer to this. But for a while now, I have been trying the
following: whenever possible, avoid class variables that aren't declared final.
Functional programming gets rid of side-effects altogether, but I don't know if
that is necessary. Inside a function the occasional i++ really isn't that
confusing, and I am not sure I want to give it up just yet. The reason is that
method-local variables have no publicly accessible state, so as long as I am
writing short functions this temporary state shouldn't be a problem. By making x
immutable you ensure that x.equals(x). This also makes it very easy to prevent
invalid states: just ensure that either the user provides valid inputs to the
constructor or the constructor throws an exception. If you do this, and don't
have mutable state, then you are guaranteed no bad states.
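
A minimal sketch of the style, with an invented class:

// All fields final and all validation in the constructor: the object is
// either valid or never constructed, and x.equals(x) always holds because
// nothing can change underneath a concurrent reader.
final class EmailAddress {
    private final String value;

    EmailAddress(String value) {
        if (value == null || !value.contains("@")) {
            throw new IllegalArgumentException("not an email address: " + value);
        }
        this.value = value;
    }

    public String value() { return value; }
}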

I haven't figured out how to make all my members final just yet (or I would
probably be using haskell). It seems to me that if I want to change a User's
email address then I need to be able to call user.setEmail(). That is because
the state of the email address is real state out there in the world that I have
to model in my program. So the domain model retains its state. But as ye of the
Java world know, the domain model is not all the code, oh no. We still have
business objects, and persistence objects, and gee-golly all kinds of other
objects. And guess what: 99% of the state in these objects can go. And when it
does, everything gets better.

But I am only starting with this concurrency thing. I am reading this book,
which is awesome. In it you can learn about all kinds of disturbing things, like
how the JVM has secretly been scrambling the order of execution of your code in
subtle ways.

Know Your Shit

Just as the workable solution is always the last thing you try, the
impossible-to-diagnose bug is always in the software layer you don't understand.
You have to understand all the layers that directly surround your code; for most
programmers this begins with the OS. If you do low-level programming you had
better know about computer architecture too. But this idea is bigger than just
catching obscure bugs; it has to do with finding the solutions to hard problems.
Someone who is familiar with the internals of an OS has enough of the big ideas
under their belt to attack most large software problems.

Web programmers know that all performance problems come from the database.
Naturally, when you first see a performance problem in a database-backed
application, you want to do some profiling and see where in the code the time is
going. After all, isn't this what everyone says to do? You can do this, but you
might as well save yourself the time: just log the queries issued on that code
path, then get the database execution plan for each, and you'll find the
performance problem pretty quickly. The reason is simple: data lives on disks,
and disks are (say) 100,000 times slower than memory. So to cause a problem in
Java you have to do something about 100,000 times more stupid than what it takes
to cause a problem in SQL. But notice how the gorgeous relational database
abstraction has broken down: in order to solve the problem one has to think
about how much data is being pulled off the disk to satisfy the query. The point
is that you can't stop at understanding the relational part; you also need to
understand the database part.

The larger problem is what we should be learning to get better at this. I know
that the following things will help, because they have helped me:

Learn a functional programming language

Learn how operating systems work

Learn how databases work

Learn how to read a computer science paper

Learn as much math as you can (but which math…)

Unfortunately it is virtually impossible to say what will not help you solve a
problem. Will knowledge of good old-fashioned AI help you write enterprise
software? It certainly might, when you implement their 400,000 lines of Java
business-rules templates as 2,000 lines in a prolog-like business-rules system.
Likewise, it isn't every day that I need to integrate at work, but when I have,
the payoff has usually been big. If you asked me whether studying something very
useless and different from computers, literature, say, would help you to solve
problems, I couldn't tell you that it wouldn't. It might be less likely than
studying operating systems or databases to pay off, so there might be some
opportunity cost, but I couldn't tell you that the next big advance won't come
from someone who divided their time between programming and literature. I think
that is pretty much the state of our art: we don't even know the framework in
which the new ideas will come, let alone what they might be. I'm not sure if
that is exciting or pathetic.

[1] "Ruby is a butterfly". Wow dude, go outside. Look at any of the beautiful
creatures on the earth. Now go back in and look at your computer. See much
resemblance? Me either.

[2] Knuth says: "A user of WEB needs to be good enough at computer science that
he or she is comfortable dealing with several languages simultaneously. Since
WEB combines TEX and PASCAL with a few rules of its own, WEB programs can
contain WEB syntax errors, TEX syntax errors, PASCAL syntax errors, and
algorithmic errors; in practice all four types of errors occur, and a bit of
sophistication is needed to sort out which is which. Computer scientists tend to
be better at such things than other people."

Just because we are better at it doesn't mean we are good at it, and even if we
are good at it that doesn't make it a good idea. Anyone who has looked at a JSP
that contained equal parts HTML, CSS, SQL, Java, and Javascript has a fair idea
what the source for WEB programs looks like. But the produced output, like all
TEX output, is absolutely stunning.

[3] Wearing sunscreen is good advice too, but computer programmers are probably
the least at-risk sub-population for skin cancer since most of them aren't white
and none of them seem to get outside enough. If in doubt write short functions
while wearing sunscreen, but if you have to give up one, lose the sunscreen.

Dynamic languages are for neat freaks not slobs

I don't know how something that fails to make the proper Foleyesque slobbering
noises about dynamic typing and Ruby on Rails managed to get voted up on reddit,
but I am glad this did. It is a rather excellent article that manages to nail
the issue without really taking sides. My only complaint is this: it gets the
recommendation exactly wrong.

If you use a slobby language like python you will have to become extremely neat.
If you work on a project with more than one programmer, you will have to start
documenting the types of every function in comments, or you will find yourself
having discussions like "fred, does this method expect a list of hashes that map
ints to strings, or a list of hashes that map ints to user objects?" You will
also have to be very serious about your testing strategy to make sure that you
maintain the ability to refactor once you have a lot of code; otherwise it is
impossible to know when you have broken something. If you have only 60% unit
test coverage in some area, you will need to do a lot of manual testing to
ensure you haven't broken something with your refactoring.

Likewise, if you work in a language like Java (or better yet, haskell!) you can
be a lot sloppier. You can make interface changes and depend on the compiler (or
a refactoring tool) to find all the various places you have just broken.

One should pick tools that correct one's own tendencies, not tools that
exacerbate problems.

So my recommendation would be this: if you are very precise about types, so much
so that you can maintain them all in your head, then dynamic languages are the
way to go; they will offer you flexibility and less writing out of types. If you
find yourself slipping a bit in your precision, then you want a compiler that
checks static types for you. I know that I am a slob, so I think that type
inference is in my future.
