Concurrent Programming
Posted by dublindan on June 4, 2009
Since the advent of inexpensive multicore processors, there has been a lot of buzz about how concurrent programming is hard. So hard, in fact, that we are in a software crisis because Moores Law is coming to an end, being replaced by systems with many parallel cores. This means that, in order to take advantage of these systems, concurrent programming is a must. Yet concurrent programming is extremely difficult.
Let me ask you, is concurrent programming actually hard? Does it need to be?
My answers are yes, concurrent programming IS, indeed, hard and no, it DOESN’T need to be. Let me explain.
In my own professional experience, concurrent programming has proven itself to be hard.
I work on core network services for the telecommunications industry, with clients in the middle-east and southeast asia, amongst other areas. We develop value added services for SMS networks for mobile operators – SMS anti spam, blacklisting, throttling and so on. On a medium sized network, we must be able to cope with 2000+ messages per second – this includes a large number of rules and processes which must be applied to each message.
When our current product was still in its infancy, the architecture was very much different than it is now. We wrapped each rule and process inside a “task” and we had a “Sequential Task Processor” which processed these tasks individually, but sequentially. One day it was decided that its time to “optimise” this system by “parallelising it”. Time estimate? About 2 hours – the tasks are there, they “only” needed to be made execute concurrently! To cut a long and painful story short, about two weeks later, we were still ironing out bugs and issues and wondering why it was slow and why we had so many intermittent problems. Not to mention how difficult this made tracing through logs – the entries could now be in any order! Eventually, months later, the entire task code was thrown out and replaced with a much much simpler system where messages are processed concurrently, but the rules were all sequential. The code was suddenly easy to understand, fast and stable. Logs were now predictable too!
Another example, again from work, was that we run our system in a clustered environment. Both to increase overall throughput and to provide redundancy. To achieve this, we had a number of data structures (hash tables, trees, etc) which were globally accessible between all nodes. These “shared caches” were used to store all sorts of interesting information, including messages themselves (so that if the node died, another one can process the message instead). Conceptually, the system was brilliant! Fault tolerant, scalable, highly distributed… except not. All of these data structures needed to be manually locked. Changes needed to be pushed out to other nodes. It became our biggest source of problems and bugs. Again, eventually, after a lot of pain, we scrapped the entire concept and redesigned the system so that each node is as independent as possible. Again, doing this made the code easier to understand, faster and a lot more stable!
What did I learn from this? Simple – concurrency is extremely hard to get right! Well, I knew this already, since I read a lot of programming blogs, but this was the first time it really sunk in.
I’m sure that anyone who has ever worked on a large concurrent system has experienced similar pain. Anyone who still feels that traditional (monitor based shared state) concurrency is not hard probably hasn’t written a terribly large or complex concurrent system.
The question, however, is: does it have to be hard?
I don’t think it does. There are some techniques which can ease the pain of concurrency. The first is Transactional Memory. A good introduction can be found over on the Intel website. Transactional Memory aims to remove explicit locking by handling all synchronised accesses as a transaction and if theres a clash, the transaction can be rolled back and run again on clean data. This certainly simplifies concurrent programming, but it solves the symptom, not the cause by hiding the complexity. Brushing it under the rug, so to speak. One still needs to deal with the same issues.
Another method, which, IMHO, greatly improves concurrent programming, is a shared-nothing message passing system where messages are immutable. This works really well, because the logic can be completely lockless! If the data cannot change, once created, then theres no danger of concurrent writes. Brilliant! Concurrency is not so hard now, is it? Highly concurrent languages like Erlang use this model and they do it pretty well.
But no, shared-nothing message passing is far from a “Silver Bullet” of concurrency. Firstly, not all problems are suited to a shared-nothing model. Secondly, and more importantly, the programmer must still manually handle the concurrency. Must still manually decide what should be concurrent and what should be sequential. The programmer must still decide when to process messages or when to post messages. In fact, this is the real problem with concurrent programming – its too manual. Locking causes problems, but its not the root problem – as I said, locking can be elliminated using Transactional Memory or message passing.
So, is concurrency still hard? Yes, it most certainly is. How can this be solved? The answer to that is by eliminating the programmer from the decisions of what should be concurrent, when and how. The compiler should determine this and concurrency should be automatic and transparent.
Easier said than done, of course and this is the holy grail of concurrent programming research! Just take a look at recent ACM SIGPLAN publications. So far, there has been little real success…
But I said that I believe concurrency doesn’t need to be hard! I still believe this, even after all that. In fact, I actually believe that concurrency is a solved problem! Not only that, but that it was solved a long long time ago.
Outrageous claim? Maybe, but let me explain.
In the book, Concepts, Techniques and Models of Computer Programming , Peter Van Roy writes a few interesting words on the topic. In the chapter which deals with internal state (for example, in object oriented programming, objects contain internal mutable state) he writes that in the next chapter he will reintroduce concurrency to produce a stateful concurrent model not unlike what we are used to in mainstream languages. He then states that this is an extremely dificult model to program in. He agrees that concurrent programming is hard!
In the introduction to the chapter on declarative concurrency, however, he asserts that “concurrent programming need not be hard”. Wow! Where did that come from!? Reading on (or understanding the chapter title: “declarative concurrency”) provides the answer: dataflow.
Dataflow is an old, yet often forgotten or ignored, programming paradigm and I believe it holds the answers, where concurrent programming is concerned. Why? Because it can handle concurrency automatically, implicitly and transparently. Programmer not required! In effect, dataflow programming is the holy grail of concurrent programming. I won’t explain here why – dataflow deserves its own dedicated blog post – but I will assure you that dataflow is real and exists RIGHT NOW.
In fact, in certain circles, its even quite popular:
- LabVIEW, commonly used for data acquisition, instrument control, and industrial automation by industry and by the scientific community.
- SCADE & Lustre, this is used in the defence and aerospace industries and is also used to run big industry operations like nuclear power plants.
- VHDL & Verilog, hardware design languages used to program FPGA’s and ASIC’s.
- I almost forgot the most well known dataflow systems of all!! One everybody is familiar with: spreadsheets.
LabVIEW and SCADE are graphical dataflow systems. VHDL and Verilog are textual dataflow languages. All four are used fairly heavily and have existed for quite some time. There are many more. I find it interesting that dataflow was the chosen paradigm for critical applications such as defence, aerospace, hardware design and power plants! If its good enough for them, its surely good enough for us mere mortals.
This proves that not only is dataflow real and available RIGHT NOW, but it also implies that there is, indeed, value in using dataflow, or these languages wouldn’t be popular in their domains.
The next step, in my opinion, is a general purpose dataflow language which is tailored for mainstream programmers and caters for general programming tasks. This langauge must automatically scale across multiple cores (or even multiple computers), be easy to learn and use and be a productive language to develop software in.
I will talk about this in a lot more detail in future posts.