Generalization vs. specialization in software

In any society there are generalists and specialists. Generalists can do lots of different things at a passable level, and sometimes surprisingly well, while specialists excel at one particular thing.

Suppose we have a simple distributed system that processes different kinds of jobs. The usual way to build one is to have one component for each kind of job, or one component for each stage a job might currently be “in”. This has many benefits, one of which is handling load: the system is horizontally scalable, so if you need more “workers” for a specific kind of task, you can just spawn more components of that kind.

This is akin to specialization — each component does one thing, and it does one thing well. Simple, and easy to debug.

Would there be any advantage to a system in which some of the components are “generalists”, meaning they can handle any kind of job more quickly, but not necessarily as well? Perhaps network latency degrades, or there is congestion of some kind, and to compensate you need to get the job done more quickly at the cost of some fidelity.

You could then respond to intermittent conditions like these by routing more work to the “generalist” components until the degraded condition is resolved and the specialized workers can be fully utilized again.

(This assumes that the specialized components require “more” of something than the general ones: more CPU power, more time, more memory or bandwidth. That may or may not be true.) Any time one of these resources is degraded, bring in the generalists.
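To make the idea concrete, here is a minimal sketch of the routing decision in Python. Everything in it is made up for illustration: the `Worker` fields, the `pick_worker` function, the job kinds, and the cost and quality numbers are all assumptions, and a real system would need an actual signal for when conditions count as "degraded".

```python
from dataclasses import dataclass


@dataclass
class Worker:
    name: str
    kinds: frozenset  # job kinds this worker accepts
    cost: float       # hypothetical resource cost per job (time, CPU, bandwidth)
    quality: float    # hypothetical fidelity of the result, 0.0 to 1.0


def pick_worker(job_kind, workers, degraded):
    """Return the worker that should handle a job of the given kind."""
    candidates = [w for w in workers if job_kind in w.kinds]
    if not candidates:
        raise ValueError(f"no worker can handle {job_kind!r}")
    if degraded:
        # Resources are scarce: take the cheapest worker, accepting lower quality.
        return min(candidates, key=lambda w: w.cost)
    # Normal conditions: take the worker that does the best job.
    return max(candidates, key=lambda w: w.quality)


# Two specialists and one generalist (illustrative numbers only).
WORKERS = [
    Worker("resizer", frozenset({"resize"}), cost=3.0, quality=0.95),
    Worker("transcoder", frozenset({"transcode"}), cost=5.0, quality=0.95),
    Worker("generalist", frozenset({"resize", "transcode"}), cost=1.0, quality=0.6),
]
```

With these numbers, `pick_worker("resize", WORKERS, degraded=False)` picks the specialist, and the same call with `degraded=True` falls back to the generalist. The sketch only holds together if the generalist really is cheaper, which is exactly the assumption flagged above.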

Text Programming

I was thinking about this last night, about how the spreadsheet is a really good example of this done correctly, and about what other scenarios you could build a similar program for.

For example, text manipulation isn’t easy in a spreadsheet. Say someone emails you a massive, comma-separated list of email addresses. It would be sweet to be able to open that in a “text programming” environment where you can do the same kinds of operations on text that you would do on columns and rows of numbers.
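As a rough idea of what those operations might look like, here is a small Python sketch that treats the pasted list as a single column and applies spreadsheet-like steps to it: trim, de-duplicate, sort, and count by group. The function names are hypothetical, and the sketch assumes the addresses contain no quoted commas.

```python
def parse_addresses(raw):
    """Split a comma-separated string into trimmed, non-empty addresses."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def unique_sorted(addresses):
    """Drop case-insensitive duplicates (keeping the first spelling), then sort."""
    seen = {}
    for address in addresses:
        seen.setdefault(address.lower(), address)
    return sorted(seen.values(), key=str.lower)


def count_by_domain(addresses):
    """Count addresses per domain, like a tiny pivot table."""
    counts = {}
    for address in addresses:
        domain = address.rpartition("@")[2].lower()
        counts[domain] = counts.get(domain, 0) + 1
    return counts
```

Something like `count_by_domain(unique_sorted(parse_addresses(raw)))` is the text equivalent of cleaning a column and then summarizing it.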

I keep a running, day-by-day list in a text file of how I spend my time with Bloc. It would be sweet to be able to highlight the whole thing and somehow tell the program that the bold text indicates days, and that days are separated by blank lines. Then, instead of just searching the text, the program could be smarter: it could recognize that these are actually dates and let me overlay them on a calendar, for instance.
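Here is a small Python sketch of what "telling the program about the structure" could amount to. It rests on assumptions I am making up for the example: that bold is written Markdown-style as `**...**`, that each day's heading is an ISO date such as `2014-03-02` (a made-up sample), and that a blank line separates days. `parse_log` and `BOLD_LINE` are hypothetical names.

```python
import re
from datetime import date

# Assumption: a bold line looks like **2014-03-02**
BOLD_LINE = re.compile(r"^\*\*(.+?)\*\*$")


def parse_log(text):
    """Map each day to the list of entry lines recorded under it."""
    days = {}
    # Blank lines separate one day's block from the next.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [line.strip() for line in block.splitlines() if line.strip()]
        if not lines:
            continue
        match = BOLD_LINE.match(lines[0])
        if match is None:
            continue  # no bold heading, so this block is not a day
        try:
            day = date.fromisoformat(match.group(1).strip())
        except ValueError:
            continue  # bold text that is not a recognizable date
        days.setdefault(day, []).extend(lines[1:])
    return days
```

Once the headings are real `date` objects instead of plain strings, sorting them, filtering by week, or laying them out on a calendar becomes ordinary date arithmetic.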