
In document The Strategic Case for Cloud-Native (Pages 65-68)

4. Empirical Analysis

4.5.3. IT Complexity

While technology openness and ecosystem sharing first and foremost present beneficial impacts on value creation and capture through CNAs, the resulting IT heterogeneity leads to increased complexity at Zalando. The internally developed self-service platform STUPS provided open access to CNA solutions across Zalando’s independent product teams and resulted in an increased heterogeneity of the IT portfolio, as Z3 reflects:

“What ended up happening was that every team went in all these different directions. […] we ended up trying to create a system [STUPS] that did everything, but nothing particularly well” (Z3, 03:06-03:23).

With the freedom and self-governance allowed on STUPS, the complexity of the application landscape at Zalando increased radically (Z1, 46:42). In comparison with monolithic systems, CNAs are more complex in terms of system architecture. Within the modular design of CNAs, keeping an overview of mutual dependencies, e.g. on shared data sources, becomes increasingly challenging. Z4 describes the complexity of microservice architectures:

“[…] you're having this one-to-many relationship that spreads out too many parts of the system. […] Furthermore, what you also introduce the so-called n+1 problem, where [one service] calls another service. And you don't know what that service has to call either to fulfill your request” (Z4, 16:57-17:35).

The lack of an overview of service dependencies may degrade application performance, e.g. through lengthy chains of service calls:

“So maybe that service is calling another service, and that service is calling another service, and you end up having these huge cascades of chains. Which then have to all finish to come back to you, and that connection is very slow” (Z4, 17:35-17:50).

As a result, the flexibility advantage of CNAs over monolithic applications with high dependencies is diminished when CNA modules and their data are largely synchronized, as stated by Z4:

“Let's say a request comes in to service A and that service A [requests] service B for data, gets the data back and gets the answer out. […] you will end up with a distributed monolith. So, you basically have none of the availability upsides, but you will get all […] complexity downsides with it” (Z4, 14:04-14:32).
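The cascading call chains and the "distributed monolith" that Z4 describes can be illustrated with a minimal sketch. It assumes a strictly sequential chain in which each service waits for the next one; the service names, latencies, and availability figures are hypothetical and not taken from the interviews.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Service:
    name: str
    latency_ms: float    # hypothetical time the service needs for its own work
    availability: float  # hypothetical probability that a single call succeeds


def chain_latency_ms(chain: List[Service]) -> float:
    """In a synchronous chain the caller waits for every hop, so latencies add up."""
    return sum(service.latency_ms for service in chain)


def chain_availability(chain: List[Service]) -> float:
    """The request succeeds only if every hop succeeds, so availabilities multiply."""
    result = 1.0
    for service in chain:
        result *= service.availability
    return result


# Hypothetical chain: A calls B, B calls C.
chain = [
    Service("A", 20.0, 0.999),
    Service("B", 35.0, 0.999),
    Service("C", 50.0, 0.999),
]

print(chain_latency_ms(chain))    # total waiting time across all hops
print(chain_availability(chain))  # lower than the availability of any single service
```

Under these assumptions, every additional synchronous hop makes the request slower and less likely to succeed, which is why tightly synchronized services lose the availability advantage of a modular design.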

As a mitigation for the problem of synchronous communication, Z4 mentions the introduction of asynchronous communication, which relies on duplicating the necessary data for each service. Thereby, the independence of every service is increased (Z4, 18:08). Yet, the creation of duplicate data leads to further complexity, especially at a higher scale, as illustrated by Z4:

“But it has a downside: it's significantly more complex. You have much more moving systems; you need to replicate the data into your own database. If, for some reason, this other service just starts sending thousands or millions of events, you might bring your own system down doing that. You need more data storage, but you have this data not one time, but many times” (Z4, 18:48-19:13).
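The trade-off Z4 outlines can be sketched as follows. This is a simplified illustration, not Zalando's implementation: the `OrderService` class, the `ProductPriceChanged` event, and all values are hypothetical, and a real system would receive events through a message broker rather than direct method calls.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class ProductPriceChanged:
    """Hypothetical event published by another service that owns the price data."""
    product_id: str
    price_cents: int


@dataclass
class OrderService:
    """Hypothetical consumer that keeps its own replica of the price data."""
    local_prices: Dict[str, int] = field(default_factory=dict)
    events_processed: int = 0

    def handle(self, event: ProductPriceChanged) -> None:
        # Replicate the data into the service's own store (the duplication cost).
        self.local_prices[event.product_id] = event.price_cents
        # Every incoming event has to be processed, so an event burst creates load.
        self.events_processed += 1

    def order_total_cents(self, product_ids: List[str]) -> int:
        # Answered from the local replica, with no synchronous call to the owner.
        return sum(self.local_prices[pid] for pid in product_ids)


orders = OrderService()
orders.handle(ProductPriceChanged("shoe", 5000))
orders.handle(ProductPriceChanged("shirt", 2000))
orders.handle(ProductPriceChanged("shoe", 4500))  # later event overwrites the replica
print(orders.order_total_cents(["shoe", "shirt"]))
```

The service can answer requests even if the data owner is unavailable, but it pays for this with its own copy of the data and with the obligation to keep up with every event it receives.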

Besides the complexity arising from dependencies and duplicates in modular CNAs, Z4 mentions the complexity of managing application changes within distributed CNA architectures (Z4, 14:56-15:16). More generally, the freedom to use different technologies for different microservices comes with an increased management effort compared to a monolithic application with consistent technology:

“On a monolith, you just deploy one application. For microservices, you deploy many. If you have to do this manually, the effort scales linearly with the amount of systems that you have and the amount of the problems that you need to do” (Z4, 20:24-20:37).

Therefore, Zalando’s CNAs must integrate automation, e.g. with continuous delivery tools. Yet, the addition of a continuous delivery layer further increases the IT complexity (Z4, 20:38-20:39).

The flexibility to use the “best-of-breed” technology for CNAs entails that product teams may also integrate less standardized technologies:

“It also leads to the problem […] that there's too much freedom when it comes to technology. You have small islands of things that not a lot of people know about” (Z2, 42:02-42:19).

Since the knowledge required to develop and operate applications based on less popular technologies is very specific, these applications become difficult to manage in the long run. If the skills to manage and maintain applications are bound to specific teams or even single software engineers, it becomes challenging to support these services when the necessary knowledge leaves the company:

“For instance, we have a small community that builds their services in Rust, but if those people leave, then you would need to either refactor that service or hire someone who knows how to run a Rust service” (Z2, 42:20-42:35).

Z2 further illustrates how “knowledge islands” emerge in manifold technology areas:

“The same thing happens with other technologies. We have a lot of Cassandra-based applications. Cassandra is a database technology, and applications are running on top of it. Over the years, we lost a lot of Cassandra expertise. Now, we basically have teams running the services on top of the technology that they don't really know” (Z2, 42:51-43:17).

The creation of siloed knowledge stands in contrast to the benefits of ecosystem sharing outlined previously. Senior software engineer Z1 elaborates on the emergence of “knowledge islands”:

“If everyone [developers] are doing different programming languages, different frameworks, [have] different ways to deploy, if people change teams - it gets very complicated” (Z1, 26:17-26:32).

To limit the growth of IT complexity while still benefiting from the flexibility of CNAs, Zalando’s IT management introduced efforts to mitigate the negative impacts. As a result of Zalando’s learnings with the STUPS platform, these efforts include the limitation of technology openness:

“Now we are also working a little bit the other way now to make it more restrictive, to have a more coherent way of doing things because it makes it very hard to understand what's going on in the whole infrastructure” (Z1, 26:04-26:15).

By defining certain “best-practice” technologies, e.g. through the Technology Radar, Zalando fosters a less fragmented IT landscape without compromising the performance of services. With this approach, Zalando manages to benefit from the manifold sources of value creation and capture of CNAs while mitigating the impact of complexity resulting from “too much” freedom:

“You can see the different technologies that we use both in terms of framework, data management, infrastructure and languages and then the different states. If you want to create a new service, we prefer if you use the technologies that are actually proven in Zalando or that are already in the adopt stage” (Z2, 45:30-46:00).
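The preference Z2 describes can be pictured as a simple lookup against the radar. The sketch below is illustrative only: the interview confirms an "adopt" stage, whereas the other stage names, the radar entries, and the `non_preferred` helper are assumptions made for this example.

```python
from enum import Enum
from typing import Dict, List


class Ring(Enum):
    ADOPT = "adopt"    # stage named by Z2
    TRIAL = "trial"    # assumed stage name
    ASSESS = "assess"  # assumed stage name
    HOLD = "hold"      # assumed stage name


# Hypothetical radar entries; placeholders, not Zalando's actual classification.
RADAR: Dict[str, Ring] = {
    "language-a": Ring.ADOPT,
    "database-b": Ring.ADOPT,
    "language-c": Ring.ASSESS,
}


def non_preferred(stack: List[str], radar: Dict[str, Ring]) -> List[str]:
    """Return the technologies of a proposed stack that are not in the adopt stage.

    Technologies missing from the radar are reported as well, since they are
    not yet proven within the organization.
    """
    return [tech for tech in stack if radar.get(tech) is not Ring.ADOPT]


proposed_stack = ["language-a", "language-c", "framework-x"]
print(non_preferred(proposed_stack, RADAR))
```

A check of this kind expresses a preference rather than a prohibition: it highlights where a new service would add to the "knowledge islands" described above.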

In conclusion, the introduction of CNAs brought complexity to Zalando’s IT landscape. The freedom of teams to implement the technologies of their choice via STUPS, combined with the inherent complexity of microservice architectures, led not only to a lack of transparency but eventually also to difficult application maintenance and slower system performance. Further, the technology openness and the resulting technology islands created dependencies on the know-how of specific employees. Altogether, these drawbacks of the CNA implementation led Zalando to restrict technology choice in favor of more control and standardization.

With IT complexity representing the final empirical observation within Zalando, the analysis continues with the presentation of the second case, Adidas.

4.6. Case presentation: Adidas
