Reactive Cloud Actors: No-nonsense micro-services

bananas · on April 13, 2014

My bullshit detector was tripped by the number of buzzwords.

What does this actually solve versus say traditional MQ (EIP stuff), service bus systems and normal SOA?

I've dealt with NServiceBus extensively and really big scale .Net stuff and if it's anything remotely close to that architecture, I'd rather hang myself than use it again.

aliostad · on April 13, 2014

Yes, so many buzzwords together - I admit.

The difference is replacing Command/Query with events, and replacing Imperative with Reactive.

Did you read the whole post, or were you put off by the buzzwords?

bananas · on April 13, 2014

Perhaps it's because it is Sunday, but I zoned out after the first couple of paragraphs. I've read it now and still don't see why this is fundamentally better.

The Command/Query pattern is an implementation concern. The architectural pattern that backs it up isn't necessarily related (for example, event sourcing). At that point it isn't imperative, so why is this different? It just appears to be fragmenting lots of concerns into even smaller ones, which are vastly more complicated to debug and manage.

The bold highlighted text of: "in a reactive actor system, each actor only knows about its own Step and what itself does and has no clue about the next steps or the rest of the system"

Isn't that the same for a message channel but without a transactional guarantee? (going back a decade or more to EIP).

aliostad · on April 13, 2014

No, it is not the same. First of all, CQRS is applying CQS to the architecture, so it is an architectural decision. When you use commands, you know about the interface of the endpoint receiving your message. In the case of an event, you do not have to know anything about the receiver, and this is the whole premise of this post. In the article to come, I explain the differences with regard to EIP in more detail as well. A half-draft is available at https://gist.github.com/aliostad/55b652e44dfe87d44444
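To make the command/event distinction concrete, here is a minimal Python sketch. All names (`EventBus`, `FraudChecker`, `"PaymentAuthorized"`) are hypothetical and chosen for illustration; this is not BeeHive's API, and direct method calls stand in for real messaging.

```python
from collections import defaultdict


class FraudChecker:
    """Hypothetical receiver used by both styles below."""

    def __init__(self):
        self.checked = []

    def check_order(self, order_id):
        self.checked.append(order_id)


class EventBus:
    """Hypothetical in-memory publish/subscribe bus."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._subscribers[event_type].append(handler)

    def publish(self, event_type, payload):
        for handler in self._subscribers[event_type]:
            handler(payload)


def pay_with_command(order_id, fraud_checker):
    # Command style: the sender names its receiver and knows its interface.
    fraud_checker.check_order(order_id)


def pay_with_event(order_id, bus):
    # Event style: the sender only announces what happened.
    bus.publish("PaymentAuthorized", order_id)


command_checker = FraudChecker()
pay_with_command("order-1", command_checker)

event_checker = FraudChecker()
bus = EventBus()
# The wiring lives outside the sender, in the subscription setup.
bus.subscribe("PaymentAuthorized", event_checker.check_order)
pay_with_event("order-2", bus)
```

In the second style the sender never references `FraudChecker`; the knowledge of who reacts has moved into the subscription.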

bananas · on April 13, 2014

Thanks. I'm sceptical (having been through several similar systems and ripped them out) but I will read your draft later.

mml · on April 13, 2014

He's certainly on to something. As he pointed out, it's been around since about '73, and many people are happy with the design of such systems. Amazon even has an actual for-real implementation of one (see Simple Workflow); he should probably try it out before rolling his own.

It really is an elegant way to do things from what I can tell. It forces you to keep your state in one, carefully managed place (hairball), rather than sprinkled around god-knows where, being poked and prodded by god knows what....

IMHO, there should be more talk about the design of actor-based systems than the glut of OO yammering we are all exposed to.

zenbowman · on April 14, 2014

Another case of reinventing the wheel and calling it something new. This is just a reimplementation of what used to be called the "blackboard architecture".

http://en.wikipedia.org/wiki/Blackboard_system

I don't see how broadcast is more "reactive" than singlecast. And the solution to tight coupling in actor systems has already been thought of - it is a message broker. This is just an actor system where every actor shares a single message broker.

It's cool that you built your own system for actors, and I certainly wouldn't discourage people from reimplementing something for the sake of learning, but every criticism you have of traditional actor systems is inaccurate. Strangely enough, you criticize Orleans in particular, which is especially good about tackling the coupling problem by making actor references virtual.

"Perhaps the central problem we face in all of computer science is how we are to get to the situation where we build on top of the work of others rather than redoing so much of it in a trivially different way."

-Richard Hamming

mamcx · on April 13, 2014

The big problem with this kind of stuff is how to debug it. Also, despite the intention, everything is really coupled: it doesn't make sense to cancel an order before creating it, so the order of the actions is important.

The benefit is (only?) ease of scaling, IMHO. How do you solve this? I understand that Go keeps the ordering of the calls and is no harder than doing imperative...

aliostad · on April 13, 2014

Well, first of all, debugging while developing an actor is actually easier: you just have to debug an actor in isolation and make sure it does the right thing and sends the right event in response to the incoming event.

In terms of debugging issues in production, you have to rely on tracing and let something such as Logstash handle search, etc.

The ordering of the workflow is preserved. As I explained, FraudCheck is only ever done after an order's payment has been authorised.

Hope this makes sense.

twic · on April 13, 2014

> Well, first of all, debugging while developing an actor is actually easier: you just have to debug an actor in isolation and make sure it does the right thing and sends the right event in response to the incoming event.

No. No, this is terrifyingly incorrect. Making each component do the right thing in isolation is easy. Most of the difficulty in systems - in developing, testing, and debugging them - arises from what the components do when they're together, because it turns out your ideas about how they need to behave in isolation were mistaken.

mamcx · on April 14, 2014

Exactly. Sometimes it is necessary to step into the call of another function/actor, see what it is doing, notice that you need to step further into a deeper function, etc.

Or stop when a certain value arrives, and inspect the code. If this could be done with actors/CSP, then it is a good idea to jump onto it.

th0br0 · on April 13, 2014

Isn't this what http://akka.io does?

aliostad · on April 13, 2014

Yes, but they do it differently. BeeHive is much more opinionated about the message (it must be an event and has an envelope) and the Basic Data Structures.

Also BeeHive avoids stateful actors. State is always persistent in HA queues or Basic Data Structure stores.

platz · on April 13, 2014

Any comments about the recent Orleans announcement? There must be differences but also I assume a lot of similarities

As you mention, there is no state in BeeHive actors, so that is one difference.

http://blogs.msdn.com/b/dotnet/archive/2014/04/02/available-...

hibikir · on April 13, 2014

I thought Rails had taught us that extremely opinionated infrastructures are doomed to at best be influential, but not widely used.

Sometimes actors have to be stateful. Sometimes they should be the interface to large, complex computations that take 20 minutes to run.

The market is chock-full of technologies that solve very narrow cloud problems. What we need is something that isn't so terribly narrow. Microservices is very, very narrow. Even people working for Fred George admit to that.

karmajunkie · on April 13, 2014

I'm having a hard time parsing this. Rails is something I would consider to be both highly opinionated and widely used, but it's unclear where your statement falls. I'm not crazy about some (most?) of the opinions, but that's a matter of some debate.

> Microservices is very, very narrow. Even people working for Fred George admit to that.

It's hard to make this statement without first agreeing on a definition of microservices. I'm not convinced there's wide agreement on such a definition yet.

cordite · on April 14, 2014

Some have attempted this, for real or for giggles.

http://www.russmiles.com/1/post/2014/04/the-microservices-ma...

karmajunkie · on April 15, 2014

upvoted that :)

th0br0 · on April 13, 2014

How exactly do you mean "stateful actor"? Whether an Akka actor has a state is actually up to you!? E.g. you could have your router fire up a new actor to process each request up to a maximum number of n concurrent actors etc...

mercurial · on April 13, 2014

Fun. I accidentally used this pattern (minus the timestamp, since it wasn't relevant for my use case) to implement a non-distributed data processing framework on top of POE. It does work pretty well, however, you need to have a way for the actor to block until enough messages have arrived (sometimes buffered operations are the most efficient). I'm not sure how you would do that with the model proposed here.
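For illustration, the "wait until enough messages have arrived" behaviour could look something like the following Python sketch. `BatchingActor` and its parameters are hypothetical names, not taken from POE or BeeHive, and the buffer here is plain in-memory state.

```python
class BatchingActor:
    """Hypothetical actor that buffers messages and processes them in batches."""

    def __init__(self, batch_size, process_batch):
        self.batch_size = batch_size
        self.process_batch = process_batch
        self.buffer = []

    def receive(self, message):
        # Hold messages until a full batch is available, then hand it off.
        self.buffer.append(message)
        if len(self.buffer) >= self.batch_size:
            batch, self.buffer = self.buffer, []
            self.process_batch(batch)


batches = []
actor = BatchingActor(batch_size=3, process_batch=batches.append)
for n in range(7):
    actor.receive(n)
```

After seven messages, two full batches have been processed and one message is still waiting in the buffer. Where that buffer would live in a design that avoids stateful actors is exactly the open question.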

mantrax4 · on April 14, 2014

This guy seems uneasy with Actors. His example is wrong and then he tries to solve it with a solution that's worse than the problem he thinks he has.

Let's untangle that stuff. Actors are about conversations between folks organized, like you would find in a good company. So let's use the conversation metaphor.

His example of "classical" actors:

- PaymentActor (to herself): Someone told me to process a payment from this order. I'm done.

- PaymentActor: Hey, FraudCheckActor, check if this order I'm giving you is a fraud.

- FraudCheckActor (to herself): Damn, this is a fraud.

- FraudCheckActor: Hey, CancelOrderActor, cancel that order I'm giving you!

- CancelOrderActor (to himself): I canceled the order.

Now I see an immediate problem here. This is anarchy. No one is in charge of the whole process, everyone runs to talk to someone else while passing around that order data, and there's no conversation happening here. Would someone in a real company organize things this way? Not in a good company, for sure!

Let's see what his solution is:

- PaymentActor (to herself): Someone told me to process a payment for this order.

- PaymentActor (screaming): Everyone! Payment is complete for this order! (screams the order info to everyone)

- FraudCheckActor (to herself): I heard that, I'll check it. Damn, this is a fraud.

- FraudCheckActor (screaming): Everyone! This order is a fraud! (screams the order info to everyone)

- CancelOrderActor (to himself): I heard that, I'll cancel that order.

- CancelOrderActor (to himself): I cancelled it.

Now I see an even bigger problem here. It's still a god damn anarchy, but now also everyone is screaming at everyone, including the entire order data, or talking to themselves. How is this better again?

Plus notice, there's still state in the system - the subscription setup. The only thing achieved here is that it'll be nearly impossible to reason about how this thing works once it becomes more like a real-world scenario.

Here's how you design it with Actors, once you have some experience:

- OrderActor: Hey, PaymentActor, process this payment and tell me what happened.

- PaymentActor: Hey, OrderActor, I processed it. It's good!

- OrderActor: Hey, FraudCheckActor, check this order for fraud and tell me what you found.

- FraudCheckActor: Hey, OrderActor, damn! This is a fraud!

- OrderActor: Hey, CancelOrderActor, cancel that order and tell me when done.

- CancelOrderActor: Hey, OrderActor, I canceled it!

Now what do we see here? We see structure. We see clear responsibilities. We see teamwork. No screaming and no chaos. Only OrderActor has the entire order data. He's passing along only the needed bits to each actor according to their responsibilities.

And best of all, PaymentActor, FraudCheckActor and CancelOrderActor are not coupled to each other. They just respond to their events, limited to their own responsibilities.

Good job OrderActor & co.

OrderActor is a supervising actor (that's an actual term). In my example, he's managing the workflow and he is stateful, and as you see it's not as scary and out-of-order as the author is telling you. After all, he needs to know the payment went through, and then remember to ask for a fraud check and know what to do after that.

(Note: technically the supervisor in this example doesn't have to have state, he can just react to the events by Payment, FraudCheck, and Cancel, but this only works in this very simple example. Real world scenarios never fit into a completely stateless world. We're talking about the order state here!)

The other actors have no state (that is seen in this example). So proper design isolates concerns, and may have stateless actors, but stateful actors are essential to the model.
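The supervisor conversation above can be sketched in Python roughly as follows. This is a simplified illustration, not any framework's API: the class and method names are hypothetical, synchronous calls stand in for request/response messages, and the payment and fraud rules are placeholders.

```python
class PaymentActor:
    def process(self, amount):
        # Placeholder rule: any positive amount is authorised.
        return "ok" if amount > 0 else "declined"


class FraudCheckActor:
    def __init__(self, blocked_cards):
        self.blocked_cards = set(blocked_cards)

    def is_fraud(self, card):
        return card in self.blocked_cards


class CancelOrderActor:
    def __init__(self):
        self.cancelled = []

    def cancel(self, order_id):
        self.cancelled.append(order_id)


class OrderActor:
    """Supervisor: the only actor that knows the workflow and the whole order."""

    def __init__(self, payment, fraud, canceller):
        self.payment = payment
        self.fraud = fraud
        self.canceller = canceller
        self.status = {}  # order state kept by the supervisor

    def place(self, order):
        order_id = order["id"]
        # Each subordinate only receives the bits it needs.
        if self.payment.process(order["amount"]) != "ok":
            self.status[order_id] = "payment_declined"
        elif self.fraud.is_fraud(order["card"]):
            self.canceller.cancel(order_id)
            self.status[order_id] = "cancelled"
        else:
            self.status[order_id] = "accepted"
        return self.status[order_id]


canceller = CancelOrderActor()
supervisor = OrderActor(PaymentActor(), FraudCheckActor({"stolen-card"}), canceller)
result_fraud = supervisor.place({"id": "A1", "amount": 25, "card": "stolen-card"})
result_ok = supervisor.place({"id": "A2", "amount": 25, "card": "good-card"})
```

Note that the three subordinate classes never reference each other; only `OrderActor.place` encodes the sequence.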

Of course, the linked article demonstrates how you can shoot yourself in the foot even with the best model out there. I find it ironic that the author was unhappy with over-complications and coupling, yet coupled everything in a complicated way with his "subscriptions" model, instead of simply using a supervisor.

karmajunkie · on April 14, 2014

There's nothing particularly wrong with his design, except for appropriating the terminology from the Actor model for what is basically CQRS.

You're just advocating for an imperative solution which may or may not be asynchronous. Nothing particularly wrong with that, but it's not the only "right" way to do it. In point of fact, the underpinning of languages like Haskell is that state is conveyed by the arguments to functions (i.e. events), and the terminal "state" of a process is just the integration of those events over time or composition of functions. State is useful as a convenience. It's not essential.
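As a small illustration of "state is the integration of events over time", here is a Python sketch that derives an order's status by folding over its event history. The event names and the `apply_event` function are hypothetical, chosen to match the order example in this thread.

```python
from functools import reduce

# Hypothetical mapping from event name to the status it leads to.
TRANSITIONS = {
    "OrderCreated": "new",
    "PaymentAuthorized": "paid",
    "FraudDetected": "fraud",
    "OrderCancelled": "cancelled",
}


def apply_event(status, event):
    """Pure function: previous status plus one event gives the next status."""
    return TRANSITIONS.get(event, status)


events = ["OrderCreated", "PaymentAuthorized", "FraudDetected", "OrderCancelled"]

# The current status is never stored; it is recomputed from the history.
final_status = reduce(apply_event, events, None)
status_after_payment = reduce(apply_event, events[:2], None)
```

The status at any point in time falls out of replaying a prefix of the event list, which is the essence of event sourcing.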

Also, not for nothing, but oversimplified metaphors like shouting in an office are just obnoxious, and also irrelevant. Shouting might be a problem in an office. Software, it turns out, is capable of paying attention to whatever it wants to pay attention to and ignoring the rest.

mantrax4 · on April 14, 2014

Actually, shouting in the office means you coupled the FraudCheckActor to a PaymentCompleteEvent, and you coupled CancelOrderActor to FraudDetectedEvent.

Your lower level actors are still coupled with each other, through quite specific events, and on top of that, dependent on a subscriber setup.

Instead of declaring your own protocol and accepting messages in that protocol, now you need to understand everyone else's protocol (in the form of their event types) and react to it if you're subscribed to it.

It's the worst of both worlds.

Now, sure, you're right. Actors aren't like people in an office. Actors can't complain when you entangle them in a hot mess, they just carry on with whatever they're told to do.

But the more experience I gain, the more I see the systems of the "real world" are in fact exactly like the systems of the "computer world". If something makes no sense in an organization made of people, then almost inevitably it'll make developers cry when having to figure out the sequence of events when implementing the same organization as a set of actors.

Take it as a useful heuristic and humor my analogies. You might find a nugget of gold there.

Also, there's nothing blasphemous if my supervisor can be implemented as synchronous imperative code. When people hear about a new paradigm, they believe they have to abandon everything and do things differently for no reason. No. Find some balance, what you knew still works, don't throw it out, just add the new benefits Actor offers to you.

I.e. while Actors mean we can have objects that do things that you can't do in imperative/synchronous fashion, this doesn't mean we should never use actors for something that might be done in imperative/synchronous fashion. 90% of the interactions between actors still look like a request/response pattern. But those 10% when things don't look like a request/response really matter when you need that ability.

Also keep in mind that an Actor doesn't care how far away its subordinates are. In synchronous programming things have to happen fast, or the thread blocks. With actors a request may be sent this week, and you may get an answer next month. Things can happen very fast, or very slow; the model is not biased. How fast is fast enough is now entirely up to your app's domain logic.

Also, some Actors implement persistence (which is far more trivial than trying to persist the callback graph or call stack in "normal" code), so they don't even care if the system shuts down and they have to start over from their last state.

Imagine OrderActor sends a call to FraudCheckActor and then the system crashes, or (if it's a business workflow process) it's just powered down for the night. When the system is restarted, OrderActor is already waiting for that answer from FraudCheckActor.

Do that with imperative/synchronous code.
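A rough Python sketch of that restart scenario, assuming the actor's state is small enough to write to a JSON file after every change. `PersistentOrderActor`, the file layout, and the message names are all hypothetical; real actor frameworks that offer persistence have their own mechanisms.

```python
import json
import os
import tempfile


class PersistentOrderActor:
    """Hypothetical actor that reloads its last saved state on start-up."""

    def __init__(self, path):
        self.path = path
        if os.path.exists(path):
            with open(path) as f:
                self.state = json.load(f)
        else:
            self.state = {"waiting_for": None, "order_id": None}

    def _save(self):
        with open(self.path, "w") as f:
            json.dump(self.state, f)

    def request_fraud_check(self, order_id):
        # Record what we are waiting for before anything else can go wrong.
        self.state = {"waiting_for": "fraud_check", "order_id": order_id}
        self._save()

    def on_fraud_result(self, order_id, is_fraud):
        expected = self.state["waiting_for"] == "fraud_check"
        if not expected or self.state["order_id"] != order_id:
            return None  # unexpected reply, ignore it
        self.state = {"waiting_for": None, "order_id": order_id}
        self._save()
        return "cancel" if is_fraud else "accept"


state_file = os.path.join(tempfile.mkdtemp(), "order_actor.json")

before_crash = PersistentOrderActor(state_file)
before_crash.request_fraud_check("A1")
del before_crash  # simulate a crash or overnight shutdown

after_restart = PersistentOrderActor(state_file)
waiting = after_restart.state["waiting_for"]
decision = after_restart.on_fraud_result("A1", is_fraud=True)
```

The restarted instance picks up exactly where the old one left off, because "what am I waiting for" is data rather than a position in a call stack.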

karmajunkie · on April 14, 2014

Yep, all of that sounds great. And almost all of it applies to CQRS with event sourcing (which IME is the typical implementation). Using events, the events ARE the protocol, and because the events are publishable by any unit, the coupling is no greater than that offered by a method call, in knowing the name and parameters required by the method.

Moreover, modeling with events offers me a lot of flexibility in how I translate the real world domain into code.

I think I agree that if you're going to use actors, what you're saying makes more sense, but I honestly don't have enough production experience with that model to say yay or nay. But with the exception of event storage (which addresses your issue of figuring out what happened in what sequence) and terminology aside, the OP is proposing something which is isomorphic to CQRS, and that I have plenty of experience with, and it works just fine.

aliostad · on April 14, 2014

Imperative style means that A has to know about B. It needs to know what the next step is. With reactive it does not. This creates the decoupling. And it is not everyone else's protocol either. An event is an important business milestone for an order, which has nothing to do with the implementation, and if you look, almost all of them are like that.

mantrax4 · on April 14, 2014

To me, "everyone has to know about the business process logic" is the opposite of decoupling.

I'd rather limit coupling to a single actor, and let the other actors be reusable.

You can't avoid someone being in charge. It's an argument as old as the universe. Do companies need a CEO? Do we need a government?

Well, someone with an imperative role always emerges sooner or later. Even in nature, only the simplest of animals don't have a centralized nervous system. The subscription model described in the article won't survive complexity, because it's an anarchy. As nature and business show, imperative is not a bad thing. It allows separation of concerns.

It's not CancelOrderActor's concern why the order is canceled. The reasons for this may be hundreds. CancelOrderActor's concern is how to cancel the order. And sure, if CancelOrderActor was A.I. we could chat about why OrderActor's practice is good, but until that changes, CancelOrderActor just takes orders. It doesn't decide what's good for the order, as it's not its job.

Someone once said to me "good software design is about separating things that change a lot, from the things that don't change a lot".

Business process changes a lot. Would you rather update all actors' bindings when this happens, or just one supervisor?

Well I might be naive, but I prefer things this way: the business process changes? I change the business process actor. The payment method changes? I change the payment actor.

And not... The business process changes? Change all the things! Because it's not just a matter of whether CancelOrderActor is bound to the FraudEvent by subscription.

It's what the CancelOrderActor does as a result of that event, and that's still defined inside that actor. And in the real world, the options are many, because no one creates an actor for just the purpose of canceling an order anyway. It's mindless boilerplate. An actor has a domain. And how that domain changes in reaction to unrelated events depends on a lot of things.

karmajunkie · on April 15, 2014

They're equally coupled. In one system, the coupling flows outward from the business process—the business process must know about the other services. In the other, the business process is domain-driven and knows nothing about processes that are driven by the events that happen within that space. The outer processes (e.g. the fraud service) know about things that happen in the domain (or rather, what messages the domain contract will send).

I've actually written a fraud detection system this way. Worked beautifully. Within the service boundary I suppose you could say it was driven imperatively, but I don't think that's what you're talking about.

> Business process changes a lot. Would you rather update all actors' bindings when this happens, or just one supervisor?

In point of fact, a lot of business processes never, ever change. The way they're handled may change, and at that point, I'd prefer for that concern to be nicely packaged into a service or actor or listener or whatever you want to model it with, where my domain logic isn't concerned with it at all. And if my domain does change? I shouldn't have to change a fraud detection service at all. And in neither your model nor the OP's would I have to.

> And not... The business process changes? Change all the things! Because it's not just a matter of whether CancelOrderActor is bound to the FraudEvent by subscription.

I fail to see how his process has this problem and yours magically doesn't. You're making a lot of hay out of whether his "reactive" events or your "imperative" messages are better or worse, but they're the same thing: sending messages. The only difference is whether broadcast is better than a direct message sent to a unit.

mantrax4 · on April 15, 2014

> I fail to see how his process has this problem and yours magically doesn't. You're making a lot of hay out of whether his "reactive" events or your "imperative" messages are better or worse, but they're the same thing: sending messages. The only difference is whether broadcast is better than a direct message sent to a unit.

No, that's not the only difference. Notice that I have 4 actors, only one is burdened with knowing how the process works. Here's what the other 3 actors see from their PoV:

- PaymentActor gets "process payment" event.

- FraudCheckActor gets "check for fraud" event.

- CancelOrderActor gets "cancel order" event.

This is what I call decoupling. 3/4 of your system is completely isolated from each other and reusable in different business processes. And the one actor who is "burdened" with knowing about the process (OrderActor) is in fact, responsible for the process (and you can open its code in one place, read it, reason about it, and change it in one place, not all over the graph).

And his model:

- PaymentActor gets "new order" event.

- FraudCheckActor gets "payment complete" event.

- CancelOrderActor gets "fraud detected" event.

Those 3 processes are coupled with each other, as they need to interpret each other's events.

If you want to move to another system where the fraud check doesn't happen right after the payment, you need to edit the FraudCheckActor's "reaction" to that event.

In my case FraudCheckActor is only reacting to the "check for fraud" event and so it doesn't have to have its reaction code edited.

There's nothing "magical" about it, just better concern isolation. If you still see this as the same, that's all I could do to explain things :)

karmajunkie · on April 16, 2014

I feel like this would be a much better discussion somewhere else; I have no idea whether you're still reading this thread. :)

I get what you're saying. The way I write my systems when doing evented/reactive logic is that there is a layer between the event and the message to, say, the FraudActor, responsible for mediating between domain events from different bounded contexts (to appropriate a term from DDD) and the system in question.

The main criticism I have about your methodology is that I reject the notion that the OrderActor should be responsible for sending messages to the fraud system. An order's responsibility should be maintaining information about the order and its status, not validating whether it's fraudulent. That validation (if the fraud detector was worth its salt) would involve a lot of historical information about the ordering party, payment information, etc., none of which should bleed into the order system, and which is the kind of thing the fraud system assembles by listening to events from all over the place. In that scenario, were the fraud system's interface to change, there would likely be changes that have to cascade to every system that touches it (regardless of whether it's developed with an actor model).

I realize you probably wouldn't, in reality and given realistic requirements, architect the system entirely the way you've outlined before this. But given the constraints of a blog post, I think the OP can be forgiven the oversimplification of the example, and you (I think) should also be able to see that his implementation has merit in real-world scenarios.

mml · on April 14, 2014

This is pretty much exactly how AWS SWF handles it.

The Java implementation of the workflow "decider" (supervising actor) is somewhat interesting as well. It actually creates a bunch of future/promise objects and replays the workflow from the top as the promises are delivered (which of course has the implication that workflow decider processes must be deterministic!).
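The replay idea can be sketched in a few lines of Python. This is only an illustration of the technique, not the AWS SWF or Flow Framework API: `decide`, `run`, the `"schedule:"`/`"complete:"` strings, and the activity names are all hypothetical.

```python
def decide(history):
    """Re-run from the top on every call.

    `history` maps the names of completed activities to their results.
    The function must be deterministic: the same history always has to
    produce the same decision, otherwise replay would diverge.
    """
    if "process_payment" not in history:
        return "schedule:process_payment"
    if history["process_payment"] != "ok":
        return "complete:payment_declined"
    if "check_fraud" not in history:
        return "schedule:check_fraud"
    if history["check_fraud"]:
        if "cancel_order" not in history:
            return "schedule:cancel_order"
        return "complete:cancelled"
    return "complete:accepted"


def run(activities):
    """Toy driver: execute each scheduled activity, then replay the decider."""
    history = {}
    trace = []
    while True:
        decision = decide(history)
        trace.append(decision)
        kind, name = decision.split(":", 1)
        if kind == "complete":
            return name, trace
        history[name] = activities[name]()


outcome, trace = run({
    "process_payment": lambda: "ok",
    "check_fraud": lambda: True,  # True means fraud was detected
    "cancel_order": lambda: "done",
})
```

The decider holds no state between calls; everything it knows comes from the recorded history, which is why it can be restarted anywhere.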

I suspect, but don't know, that AMZN has been dogfooding this product in a big way for quite some time.

(full disclosure: just spent some time with the swf guys in seattle)

On the other hand, it seems like the long way around to inventing Erlang ;)

orasis · on April 13, 2014

Sorry, I couldn't get past the terrible whitish text on blackish background. Instant headache.