I did this stuff professionally for about a decade. The short version: consider the nodes of the graph. Stuff breaks because the front end can't reach the back end, the back end can't take a message off the queue, the database stopped taking connections, etc. In other words, most apps of any size or complexity rely on other systems or infrastructure components, and these become the things that break, time out, don't scale, etc. That's a good starting point.
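To make "consider the nodes of the graph" a bit more concrete, here is a rough sketch of one way to write the dependencies down and probe each edge with a timeout. The topology, host names, and ports are all made up for illustration, and `tcp_reachable` and `check_edges` are hypothetical helpers, not part of any particular tool.

```python
import socket
from typing import Callable, Dict, List, Tuple

# Hypothetical topology: each node maps to the (name, host, port) of
# the things it depends on. Replace with your real components.
DEPENDENCIES: Dict[str, List[Tuple[str, str, int]]] = {
    "frontend": [("backend", "backend.internal", 8080)],
    "backend": [
        ("queue", "queue.internal", 5672),
        ("database", "db.internal", 5432),
    ],
}


def tcp_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, DNS failures, and timeouts.
        return False


def check_edges(
    deps: Dict[str, List[Tuple[str, str, int]]],
    probe: Callable[[str, int], bool] = tcp_reachable,
) -> List[str]:
    """Probe every dependency edge and return the failing ones as 'a -> b'."""
    failures = []
    for node, targets in deps.items():
        for name, host, port in targets:
            if not probe(host, port):
                failures.append(f"{node} -> {name}")
    return failures
```

Calling `check_edges(DEPENDENCIES)` would list the edges that could not be reached from wherever the script runs. A plain TCP connect only tells you the port is open; it won't catch a database that accepts the socket but refuses new sessions, so treat this as a first pass over the graph rather than a full health check.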