Securing Streaming Data Over the Web
“Security is the chief enemy of mortals.” ― William Shakespeare
“The user’s going to pick dancing pigs over security every time.” ― Bruce Schneier
Take Me to the River
It’s a real-time world. Enterprises live in real-time. Business processes happen in real-time with live, streamed information passing from app to app, server to server.
These types of business-critical streaming systems apply to a vast number of use cases. Today’s data analytics doesn’t wait for overnight crunching or hours of offline study. Mention the word “batch” and you’ll get raised eyebrows and derogatory comments about your old-fashioned taste for classic rock music and intense hatred for selfie sticks.
Many of these on-demand, streaming processes occur in and outside the firewall, among on-premises and off-premises cloud infrastructures. Your enterprise, partners, customers and entire ecosystem depend on many of these real-time events.
Historically we have seen a huge trend with programmatic ecosystem integration to address this cross-firewall connectivity. The hipsters have proclaimed this wave as the “API Economy” (btw, anyone want to buy my “Web Properties”? They’re near a soothing digital stream). Major enterprises are rushing to extend their businesses over the firewall with programmatic interfaces or APIs. This type of approach has potentially more rewarding business implications for additional revenue streams and deepening the customer engagement. There’s no question this is a valuable evolutionary trend
Go With the Flow
Thousands of these public and private B2B APIs from such companies as Amazon, Twitter, Facebook, Bloomberg, British Airways, NY Times and the US Government are now available. A quick visit to the very popular ProgrammableWeb indicates the rapidly growing numbers of APIs that connect to very useful services.
However, many of these APIs primarily use a heavyweight, non-reactive communication model called “request-response” initiated by the client application using the traditional, legacy network plumbing of the web, HTTP.
Alternatively some of these companies and others have been recently offering modern, streaming APIs for browsers, mobile devices and embedded “Internet of Things” applications. We are seeing applications in logistics, health-monitoring, smart government, risk management, surveillance, management dashboards and others offering real-time distributed business logic that provide a significantly higher level of customer or partner engagement.
Cry Me a River
However, there are huge security and privacy issues to consider when deploying streaming interfaces.
Offering these real-time and non-real-time APIs seems risky despite their business potential. Not only do they have to be reliable, robust and efficient in a mobile environment, they have to be encrypted, fully authenticated, comply with corporate privacy entitlements and deployed in a multi-DMZ environment where no business logic should be deployed. And certainly no enterprise wants any open ports on the internal, private network to be compromised by the “black hats”. If we could just solve this last one, we could avoid creating replicant and expensive systems in the DMZ for the purposes of security and privacy.
Here’s just a short list of deployment concerns to be aware of for offering streaming (and even conventional synchronous) APIs.
Streaming data must be able to be sent to (or received from) all types of mobile devices, desktops and browsers. It may be an iOS, Android, Linux, Windows or Web-endowed device. Your target device may even be a television, car or perhaps some type of wearable. Services and data collection/analytics must use consistent APIs to provide coherent and pervasive functionality for all endpoints. Using different APIs for different devices is inelegant, which we computer geeks know means more complexity and more glitch potential. And it means more wasted weekends lost to debugging on your Linux box instead of having a few of those great margaritas and tequila shots at that hip TexMex bar downtown.
HTTP was not really designed for persistent connections that are needed for streaming data. Yes, you can twist and fake out HTTP for long-lived connections and use Comet-style pushes from the server and get something to work. But let’s face it, after you’re done hacking, you feel good as an engineer but you feel really lousy as an architect… and real nervous if you’re the CTO.
The typical networking solution for streaming and persistent connections in general is either to create and manage a legacy-style VPN or, open non-standard ports over the Internet. Since most operations people enjoy the comfort of employment, asking them to open a non-standard port will either have them laughing hysterically or pretending you didn’t exist. Installing yet another old-fashioned low-level VPN doesn’t seem fun either. You have to get many more management signoffs than you originally thought, and have to deal with mind-numbing political and administrative constraints. Soon you start to question your own sanity.
“And what about our IoT requirements?” bellows your CIO during your weekly status meeting (and its a lovely deep bellow too). Remember streaming needs to be bidirectional. While enterprise-streaming connectivity is primarily sending to endpoints, IoT connectivity is sending from the endpoints. A unified architecture needs to handle both enterprise and IoT use cases in a high-performance manner.
As with most business-critical networking topologies, any core streaming services deployment must be devoid of business logic and capable of installation into a DMZ or series of DMZ protection layers. You need to assume the black hats will break in to your outer-most DMZ, so there shouldn’t be any valuable business or security intelligence resident in your DMZ layers. At the least, you should try to avoid read-only replication copies of back-end services in the DMZ as much as possible… because its yet another management time and money sink.
As your ecosystem grows (and shrinks), connectivity must adapt on-demand and take place over a very reliable connection.
Leveraging the economies and agility of the web and WebSocket is phenomenally useful, but automatic reconnection between critical partners over the Web is even more so.
Just for the record, production conversations that traverse open networks must be secure via TLS/SSL encryption. So always use secure WebSocket (wss://) and secure HTTP (https://) for business purposes. Nuff said.
Of course, users must be checked to confirm they are allowed to connect. Instead of dealing with low-level access via legacy VPNs that potentially grant open access at the network layer, it is significantly more secure to only allow application-service access. This Application-to-Application (A2A) services connectivity (using standard URIs) is a tiny surface area for the black hats, which btw, becomes microscopic with Kaazing’s Enterprise Shield feature. This feature shuts down 100% of all incoming ports and further masks internal service and topology data. Yes, I did say 100%.
Once a user is fully authenticated and connected to your streaming service, what operations are they entitled to perform? In other words, their access-control rights need to be confirmed. Again ideally this type of control should not be in the DMZ. Telling the operations team to incur several weeks of headaches getting corporate signoffs because you need a replicant Identity subsystem in the DMZ will not be easy. Don’t expect an invitation to their holiday party after that request.
Stream Protocol Validation
Real-time data need to be inspected for conformance to A2A protocol specifications and avoid injection of insecure code. Any streaming infrastructure needs to guarantee any application protocol must follow the rules of conversation. Any data in an unexpected format or in violation of a framing specification must be immediately discarded and the connection terminated.
There are certainly additional issues to consider for streaming data for your B2B ecosystem. Performance, scalability, monitoring, logging, et al, are equally important. We’ll cover those in a future KWICie soon.
Sittin’ on the Dock of the Bay
If you’re attending AWS re:Invent 2015, please stop by the Kaazing booth (K24) to say hello. I’m always interested to chat with customers and colleagues about the future of cloud computing, containers, autonomous computing, microservices, IoT and the unfortunate state of real music.