Reading some papers for my class, and in particular Andrew's "Reactive Objects" paper, last night, has stimulated a few thoughts about Infopipe DSL issues. I thought I'd write these down and send them out to stimulate further thoughts and discussion.


It occurs to me that streaming applications are fundamentally
I/O driven. Well, all programs are, I guess, but I/O seems to be
particularly prominent in streaming applications since they are
all about moving data from one place to another often with some
transformations along the way and some timing constraints. So,
we need to hide the complexity of programming I/O. The complexity
we want to avoid includes approaches such as (a) using blocking I/O
calls with multi-threading, (b) using non-blocking I/O calls with
callbacks, and (c) polling, perhaps timer-driven. We need to be
able to support concurrent waiting, within a single Infopipe component
and across Infopipe components. We also need to support active and
passive sources and sinks. The reactive object approach seems to me
to be a useful starting point for all of this.
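
To make the "concurrent waiting within a single component" idea concrete, here is a minimal sketch (in Python for brevity; the names and the use of socketpairs as stand-in sources are my own, not Infopipe code). One reactive loop multiplexes several sources in a single wait, with no per-source thread, no callback registration API, and no timer-driven polling:

```python
# Sketch: one reactive loop waiting concurrently on several sources.
# socketpair() stands in for real network/file sources (an assumption).
import selectors
import socket

sel = selectors.DefaultSelector()

def make_source(name):
    r, w = socket.socketpair()
    r.setblocking(False)
    sel.register(r, selectors.EVENT_READ, name)  # name travels with the fd
    return w

received = []
writers = {n: make_source(n) for n in ("video", "audio")}
writers["video"].send(b"frame-1")
writers["audio"].send(b"sample-1")

# A single select() covers all sources; the loop reacts to whichever
# is ready, which is the essence of the reactive-object style.
while len(received) < 2:
    for key, _ in sel.select(timeout=1):
        data = key.fileobj.recv(4096)
        received.append((key.data, data.decode()))

print(sorted(received))
```

The point is that the waiting is centralized in one place, which is exactly the kind of glue a DSL compiler could emit so the component author never writes it.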


Obviously, the Infopipe abstraction lends itself nicely to pipelined
concurrency - meaning that conceptually, one stage of a pipeline,
represented as an Infopipe component, is processing one item at the
same time as another stage (component) is processing a different item.
We need to support this kind of concurrency virtually (on a single CPU)
and truly (on several CPUs). True concurrency includes running different
components on different CPUs of a shared memory multiprocessor, and
running them on different nodes in a distributed system. We need to do
this without forcing the application programmer to deal with the complexity
of local or remote inter-component communication and synchronization.
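
As a sketch of what the generated inter-component glue might look like (my own toy names, using threads and a bounded queue as the stand-in transport), two stages run concurrently while the component author supplies only the per-item function:

```python
# Sketch: pipelined concurrency between two components, with the
# queue playing the role of compiler-generated communication glue.
import queue
import threading

q_in, q_out = queue.Queue(), queue.Queue()

def stage(inq, outq, f):
    # Generic stage driver: the author writes only f(item).
    while True:
        item = inq.get()
        if item is None:      # end-of-stream marker (an assumption)
            outq.put(None)
            return
        outq.put(f(item))

t = threading.Thread(target=stage, args=(q_in, q_out, lambda x: x * 2))
t.start()
for i in range(5):
    q_in.put(i)
q_in.put(None)
t.join()

results = []
while (item := q_out.get()) is not None:
    results.append(item)
print(results)  # [0, 2, 4, 6, 8]
```

The same stage driver could, in principle, be instantiated over a local queue, shared memory, or a network connection without touching the author's code.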

So far, I have said nothing about concurrency within an Infopipe component.
But we may also want to support this. In this case the concurrency may be
constrained by ordering constraints on the output. For example, an encoder
component may encode two successive video frames in parallel on an SMMP,
but must output the first before the second. For this kind of concurrency,
the Infopipe abstraction doesn't help us as much, but the property
specifications might be a good starting point for generating synchronization
approaches with different behaviors.
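
The encoder example can be sketched directly: frames are encoded in parallel, but a small reorder buffer (my own construction, not an Infopipe mechanism) enforces the ordering constraint on the output:

```python
# Sketch: intra-component parallelism with an ordered-output constraint.
# Frames complete in any order; they are released strictly in sequence.
import concurrent.futures
import heapq

def encode(seq):
    # Stand-in for real frame encoding (an assumption).
    return (seq, f"encoded-{seq}")

emitted = []
with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(encode, i) for i in range(6)]
    pending, next_seq = [], 0
    for fut in concurrent.futures.as_completed(futures):
        heapq.heappush(pending, fut.result())
        # Release a frame only once every earlier frame has been released.
        while pending and pending[0][0] == next_seq:
            emitted.append(heapq.heappop(pending)[1])
            next_seq += 1

print(emitted)
```

This is the sort of synchronization pattern that a property specification ("output order must match input order") might let the compiler generate automatically.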


We need to think about what synchronization means in a reactive model.
In a single CPU system one might make event handling non-preemptable
and quit worrying about synchronization. However, we will have to support
true concurrency, so we need to think about synchronization. It occurs
to me that the synchronization for the pipelined concurrency case always
follows a producer-consumer pattern. This is forced upon it by the Infopipe
abstraction. It also occurs to me that this passing of items among
components may be viewed as adding items to, and removing items from, a linked list.
For certain tasks like this there are some highly efficient non-blocking
synchronization approaches that run well on shared memory multiprocessors.
If a DSL compiler could generate this kind of code automatically that would
save a lot of programming complexity. We need to look closely to see if this
stuff is appropriate though.
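
To illustrate the pattern (not the non-blocking implementation itself): the sketch below uses collections.deque, whose append/popleft happen to be atomic in CPython, as a stand-in for the CAS-based lock-free list operations a real SMMP implementation would use. The structure of the generated code would be the same; only the list primitives would change:

```python
# Sketch of the producer/consumer pattern the generated code would follow.
# deque stands in for a non-blocking linked list (an assumption).
import collections
import threading

channel = collections.deque()
DONE = object()               # end-of-stream sentinel
consumed = []

def producer():
    for i in range(1000):
        channel.append(i)     # "add item to the list" -- no lock taken

    channel.append(DONE)

def consumer():
    while True:
        try:
            item = channel.popleft()   # "remove item from the list"
        except IndexError:
            continue                   # list momentarily empty; retry
        if item is DONE:
            return
        consumed.append(item)

threads = [threading.Thread(target=producer),
           threading.Thread(target=consumer)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(consumed == list(range(1000)))  # True
```

A real non-blocking version would replace the retry loop with a wait/notify or backoff scheme, but the producer/consumer shape forced by the Infopipe abstraction is unchanged.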

For true concurrency within a component we need to look at high-performance
approaches to synchronization. Taking a look at techniques like RCU, and
thinking about whether it would be possible to automatically generate
code for certain synchronization patterns, would be an interesting research
direction.
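
The core RCU idea, stripped of the kernel machinery, can be sketched in a few lines (my own toy example; real RCU also needs grace-period tracking to reclaim old versions safely): readers take no lock, and a writer copies, updates the copy, then publishes it with a single reference swap.

```python
# Sketch of the read-copy-update idea for read-mostly shared state.
config = {"bitrate": 2000}    # shared, read-mostly state (an assumption)

def reader():
    snapshot = config          # one reference read; no lock taken
    return snapshot["bitrate"]

def writer(key, value):
    global config
    updated = dict(config)     # copy
    updated[key] = value       # update the copy
    config = updated           # publish: readers see old or new, never a mix

before = reader()
writer("bitrate", 4000)
after = reader()
print(before, after)  # 2000 4000
```

Recognizing "read-mostly shared state" from a property specification and emitting this pattern is exactly the kind of automatic generation hinted at above.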

DSL Compiler

If the DSL compiler could generate the code needed to hide concurrent
I/O waiting within and across components, deal with active and passive
I/O, synchronize data exchange among producer and consumer components,
and marshal data exchange across networks, we would have gone a long way
toward achieving our goal. The compiler and runtime also need to somehow
schedule the execution of pipeline components in order to satisfy the
timing and prioritization properties. ...


If we implement Infopipe components as reactive objects
or event handlers we will need to consider how to do
event dispatching. This is sort of the equivalent of a
scheduling policy for a thread-based approach. It seems
to me that event dispatching policy is going to be application
specific and will probably encompass notions of fairness,
timeliness and graceful degradation under load, as well
as performance issues having to do with making best use
of the underlying resources (processor/cache affinity etc).
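
As a sketch of what a pluggable dispatching policy might look like (the Dispatcher class and its methods are my own invention for illustration), here is a priority-ordered policy with FIFO tie-breaking; fairness or deadline-aware policies would swap in a different ordering:

```python
# Sketch: a dispatching policy deciding which pending event runs next.
import heapq
import itertools

class Dispatcher:
    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # FIFO order among equal priorities

    def post(self, priority, handler, arg):
        # Lower number = more urgent (an assumption of this sketch).
        heapq.heappush(self._heap,
                       (priority, next(self._counter), handler, arg))

    def run(self):
        while self._heap:
            _, _, handler, arg = heapq.heappop(self._heap)
            handler(arg)

log = []
d = Dispatcher()
d.post(2, log.append, "housekeeping")
d.post(0, log.append, "deadline-frame")
d.post(1, log.append, "ordinary-frame")
d.run()
print(log)  # ['deadline-frame', 'ordinary-frame', 'housekeeping']
```

Note that the policy lives entirely in post()/run(); the handlers know nothing about it, which is what makes it application-specific and replaceable.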

This view also raises the question of whether buffers are
components at the same level as other Infopipe components,
since for the event dispatching policies mentioned above
one would have to have event queues/buffers between
components, with the servicing of those queues/buffers
being determined by the dispatching policy. In such a system,
do you think it is more sensible to view Priority-Progress-Streaming
as an Infopipe component or an event dispatching policy?

-- Jon