Evolution of Distributed Computing Systems
In this article, we trace the history of distributed computing systems from the mainframe era to the present day, to the best of my knowledge. Understanding that history makes it easier to see how far we have progressed. The story of distributed computing is one of evolution from centralization to decentralization: it shows how centralized systems were gradually opened up and spread out over time. In the mid-1950s we had centralized systems such as the mainframe; today we commonly use decentralized approaches such as edge computing and containers.
1. Mainframe: In the early years of computing, between 1960 and 1967, mainframe-based machines were considered the best solution for processing large-scale data because they provided time-sharing to local clients who interacted with them through teletype terminals. This type of system introduced the idea of client-server architecture: the client connects and sends requests to the server, and the server processes those requests, so a single time-sharing system can share its resources among multiple clients over a single medium. Its major drawback was that it was quite expensive, which led to the innovation of early disk-based storage and transistor memory.
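To make the time-sharing idea concrete, here is a minimal sketch of one machine serving several terminals in turn. It assumes a simple round-robin policy with a fixed time slice; the function name `run_time_sharing`, the terminal names, and the work units are hypothetical and are not taken from any real mainframe scheduler.

```python
from collections import deque


def run_time_sharing(jobs, time_slice):
    """Serve terminal jobs round-robin; return (terminal, finish_time) pairs.

    jobs maps a terminal name to the number of work units it has requested.
    """
    queue = deque(jobs.items())
    clock = 0
    finished = []
    while queue:
        terminal, remaining = queue.popleft()
        used = min(time_slice, remaining)
        clock += used
        remaining -= used
        if remaining > 0:
            # Not done yet: rejoin the back of the queue and wait for another turn.
            queue.append((terminal, remaining))
        else:
            finished.append((terminal, clock))
    return finished


if __name__ == "__main__":
    jobs = {"tty1": 5, "tty2": 2, "tty3": 4}  # hypothetical terminals
    for terminal, done_at in run_time_sharing(jobs, time_slice=2):
        print(f"{terminal} finished at t={done_at}")
```

Each terminal gets a short turn on the single processor, so short requests finish quickly even when a long one arrived first.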
2. Cluster Networks: In the early 1970s, packet switching and cluster computing were developed. Cluster computing was considered an alternative to mainframe systems, although it was still expensive. In cluster computing, the underlying hardware consists of a collection of similar workstations or PCs, closely connected by a high-speed local-area network, with each node running the same operating system. Its purpose was to achieve parallelism. Between 1967 and 1974, we also saw the creation of ARPANET, an early network that enabled global message exchange and allowed services to be hosted on remote machines across geographic boundaries, independent of a fixed programming model. The TCP/IP protocol suite, which facilitates datagram and stream-oriented communication over a packet-switched, autonomous network of networks, also came into existence. Communication was mainly through datagram transport.
3. Internet & PCs: During this era, the Internet took shape. New technology such as TCP/IP began to transform the Internet into a set of interconnected networks, linking local networks to the wider Internet. As a result, the number of hosts connected to the network grew rapidly, and centralized naming systems such as HOSTS.TXT could no longer scale. The Domain Name System (DNS), specified in 1983 and in operation by 1985, solved this by translating hosts' domain names into IP addresses. Early GUI-based computers using WIMP (windows, icons, menus, pointers) were also developed, which made computing feasible in the home and brought applications such as video games and web browsing to consumers.
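The difference between a flat HOSTS.TXT-style table and hierarchical name resolution can be sketched in a few lines. This is a toy model, not how a real DNS resolver works: the zone data, the `resolve` function, and the host names are hypothetical, and the addresses come from the 192.0.2.0/24 range reserved for documentation.

```python
# Flat table in the style of HOSTS.TXT: every host is listed in one central file.
HOSTS_TXT = {"mail.example.org": "192.0.2.10"}

# Hierarchical delegation: each level only needs to know its own children.
ROOT = {
    "org": {
        "example": {
            "www": "192.0.2.20",
            "mail": "192.0.2.10",
        },
    },
}


def resolve(name, root=ROOT):
    """Walk the labels right to left, following delegations down from the root."""
    node = root
    for label in reversed(name.lower().rstrip(".").split(".")):
        if not isinstance(node, dict) or label not in node:
            return None  # no such name
        node = node[label]
    # Only a leaf (an address string) counts as an answer in this toy model.
    return node if isinstance(node, str) else None


if __name__ == "__main__":
    print(HOSTS_TXT.get("www.example.org"))  # missing until someone edits the central file
    print(resolve("www.example.org"))        # found by walking org -> example -> www
```

With the flat table, every new host means editing one shared file. With delegation, whoever runs `example.org` can add hosts under that branch without touching the rest of the tree.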
4. World Wide Web: During the 1980s and 1990s, the creation of the HyperText Transfer Protocol (HTTP) and HyperText Markup Language (HTML) resulted in the first web browsers, websites, and web servers. The Web was developed by Tim Berners-Lee at CERN. Standardization of TCP/IP provided the infrastructure for the interconnected network of networks on which the World Wide Web (WWW) runs. This led to tremendous growth in the number of hosts connected to the Internet. As the number of PC-based application programs running on independent machines grew, communication between those programs became extremely complex, and application-to-application interaction became a growing challenge. Network computing, which enables remote procedure calls (RPCs) over TCP/IP, became a widely accepted way for application software to communicate. In this era, servers provide resources identified by Uniform Resource Locators (URLs). Software applications running on a variety of hardware platforms, operating systems, and networks faced challenges when they needed to communicate with each other and share data. These challenges led to the concept of distributed computing applications.
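The remote procedure call idea mentioned above can be sketched without a real network. In this minimal example, a direct function call stands in for the TCP/IP connection, which is an assumption made to keep the sketch self-contained. The message format, the `RpcClient` class, and the `add` procedure are all hypothetical and do not follow any particular RPC standard.

```python
import json


def add(a, b):
    return a + b


# Procedures the hypothetical server exposes to remote callers.
PROCEDURES = {"add": add}


def handle_request(raw):
    """Server side: decode the message, run the procedure, encode the reply."""
    request = json.loads(raw.decode("utf-8"))
    procedure = PROCEDURES.get(request["method"])
    if procedure is None:
        reply = {"error": f"unknown method {request['method']}"}
    else:
        reply = {"result": procedure(*request["params"])}
    return json.dumps(reply).encode("utf-8")


class RpcClient:
    """Client stub: makes a remote procedure look like a local method call."""

    def __init__(self, transport):
        self._transport = transport  # callable: request bytes -> reply bytes

    def call(self, method, *params):
        raw = json.dumps({"method": method, "params": list(params)}).encode("utf-8")
        reply = json.loads(self._transport(raw).decode("utf-8"))
        if "error" in reply:
            raise RuntimeError(reply["error"])
        return reply["result"]


if __name__ == "__main__":
    # Assumption: handle_request is called directly instead of over a socket.
    client = RpcClient(transport=handle_request)
    print(client.call("add", 2, 3))
```

The caller never sees the encoding and decoding steps. That is the appeal of RPC: applications on different machines interact as if they were calling ordinary functions.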
5. P2P, Grids & Web Services: Peer-to-peer (P2P) computing or networking is a distributed application architecture that partitions tasks or workloads between peers without requiring a central coordinator. Peers share equal privileges, and each node in a P2P network acts as both a client and a server. P2P file sharing entered the mainstream in 1999, when American college student Shawn Fanning created the music-sharing service Napster. P2P networking enables a decentralized internet. Peer-to-peer networks are often formed from collections of 12 or fewer machines. Each of these computers applies its own security to protect its data, but it also shares data with every other node. Because the nodes both consume and produce resources, the network's capacity for resource sharing grows as the number of nodes grows. This is distinct from client-server networks, where an increase in nodes can overload the server. Securing the nodes properly is challenging because they function as both clients and servers, and this can leave the network open to denial-of-service attacks. Most contemporary operating systems, including Windows and Mac OS, come with software to implement peer-to-peer networking.
With the introduction of grid computing, multiple tasks can be completed jointly by computers connected over a network. It makes use of a data grid, that is, a set of computers that interact directly with each other through middleware to perform similar tasks. Between 1994 and 2000, we also saw the creation of effective x86 virtualization.
Web services established platform-independent communication: they use XML-based information exchange over the Internet for direct application-to-application interaction. Through web services, Java can talk with Perl, and Windows applications can talk with Unix applications.
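The claim that every peer is both a client and a server, and that capacity grows with the number of nodes, can be illustrated with a small in-memory model. This is only a sketch under simplifying assumptions: there is no real networking, peers only ask their direct neighbors, and the `Peer` class and file names are hypothetical.

```python
class Peer:
    """A node that both serves files (server role) and requests them (client role)."""

    def __init__(self, name, files=None):
        self.name = name
        self.files = dict(files or {})
        self.neighbors = []

    def connect(self, other):
        # Links are symmetric: no peer has more privileges than another.
        self.neighbors.append(other)
        other.neighbors.append(self)

    def serve(self, filename):
        """Server role: hand over a file if this peer holds it."""
        return self.files.get(filename)

    def fetch(self, filename):
        """Client role: ask direct neighbors, then keep a copy to share onward."""
        for peer in self.neighbors:
            data = peer.serve(filename)
            if data is not None:
                self.files[filename] = data
                return peer.name
        return None


if __name__ == "__main__":
    alice = Peer("alice", {"song.mp3": b"demo-bytes"})
    bob, carol = Peer("bob"), Peer("carol")
    alice.connect(bob)
    bob.connect(carol)
    print(carol.fetch("song.mp3"))  # None: bob does not hold the file yet
    print(bob.fetch("song.mp3"))    # alice
    print(carol.fetch("song.mp3"))  # bob: a new source appeared without any central server
```

Once bob has downloaded the file, he becomes a source for carol. Every consumer adds serving capacity, in contrast to a single server that every client must share.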
6. Cloud, Mobile & IoT: Cloud computing emerged from the convergence of cluster technology, virtualization, and middleware. Through cloud computing, you can manage your resources and applications online over the internet without installing or hosting them on your own hard drive or server. The major advantage is that they can be accessed from anywhere in the world. Many cloud providers offer subscription-based services: after paying for a subscription, customers can access all the computing resources they need. Customers no longer need to replace outdated servers, buy hard drives when they run out of storage, install software updates, or buy software licenses. The vendor does all that for them. Mobile computing allows us to transmit data, such as voice and video, over a wireless network, so our devices no longer need a wired connection to a switch. Some of the most common forms of mobile computing are smart cards, smartphones, and tablets. IoT also began to emerge from mobile computing, using sensors, processing ability, software, and other technologies to connect and exchange data with other devices and systems over the Internet.
As Application Programming Interface (API) based communication over the REST model evolved, it had to provide scalability, flexibility, portability, caching, and security. Instead of implementing these capabilities separately in each and every API, it made sense to have a common component that applies them on top of the APIs. This requirement led to the evolution of the API management platform, which today has become one of the core features of any distributed system.
Instead of treating one computer as a single machine, the idea of running multiple systems within one computer came into existence. This led to virtual machines, where the same computer can act as multiple computers and run them all in parallel. Although this was a good idea, it was not the best option in terms of resource utilization on the host computer. Virtualization products available today include VMware Workstation, Microsoft Hyper-V, and Oracle VM VirtualBox.
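The "common component" for API management described above can be sketched as a small gateway that applies an API-key check and response caching in front of every registered handler. This is an illustrative toy rather than the design of any real API management product: the `ApiGateway` class, the status codes it returns, and the `/orders` route are hypothetical, and the cache never expires.

```python
class ApiGateway:
    """Applies shared concerns (auth, caching) in front of every registered API."""

    def __init__(self, api_keys):
        self._api_keys = set(api_keys)
        self._routes = {}
        self._cache = {}

    def register(self, path, handler):
        self._routes[path] = handler

    def get(self, path, api_key):
        if api_key not in self._api_keys:
            return 401, "unauthorized"
        if path not in self._routes:
            return 404, "not found"
        if path not in self._cache:
            # Only the first request reaches the backend; later ones hit the cache.
            self._cache[path] = self._routes[path]()
        return 200, self._cache[path]


if __name__ == "__main__":
    backend_calls = []

    def list_orders():
        # Hypothetical backend API with no security or caching logic of its own.
        backend_calls.append("orders")
        return ["order-1", "order-2"]

    gateway = ApiGateway(api_keys={"demo-key"})
    gateway.register("/orders", list_orders)
    print(gateway.get("/orders", "demo-key"))
    print(gateway.get("/orders", "demo-key"))
    print(gateway.get("/orders", "wrong-key"))
    print(len(backend_calls))
```

The backend function stays free of cross-cutting code. Adding rate limiting or logging later would mean changing the gateway once rather than every API.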
7. Fog and Edge Computing: When the data produced by mobile computing and IoT services started to grow tremendously, collecting and processing millions of data points in real time remained a problem. This led to the concept of edge computing, in which client data is processed at the periphery of the network; it is all a matter of location. Rather than moving the data across a WAN such as the internet to a centralized data center, which may cause latency issues, it is processed and analyzed closer to the point where it is created, such as on a corporate LAN. Fog computing greatly reduces the need for bandwidth by not sending every bit of information over cloud channels, and instead aggregating it at certain access points. This type of distributed strategy lowers costs and improves efficiency. Companies such as IBM are among the driving forces behind fog computing. Together, fog and edge computing extend the cloud computing model away from centralized stakeholders toward decentralized multi-stakeholder systems that are capable of providing ultra-low service response times and increased aggregate bandwidth.
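The bandwidth saving from aggregating data near its source can be shown with a small sketch. It assumes a hypothetical `EdgeNode` that buffers numeric sensor readings and forwards one summary per batch; a plain list stands in for the uplink to the cloud, and the batch size is arbitrary.

```python
from statistics import mean


class EdgeNode:
    """Buffers raw sensor readings locally and forwards one summary per batch."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self._buffer = []
        self.sent_to_cloud = []  # stands in for the uplink to the data center

    def ingest(self, reading):
        self._buffer.append(reading)
        if len(self._buffer) >= self.batch_size:
            self._flush()

    def _flush(self):
        # Summarize the batch so only one small message crosses the WAN.
        summary = {
            "count": len(self._buffer),
            "min": min(self._buffer),
            "max": max(self._buffer),
            "mean": mean(self._buffer),
        }
        self.sent_to_cloud.append(summary)
        self._buffer.clear()


if __name__ == "__main__":
    node = EdgeNode(batch_size=5)
    for reading in range(1, 11):  # ten hypothetical sensor readings
        node.ingest(reading)
    print(len(node.sent_to_cloud), "messages sent instead of 10")
    print(node.sent_to_cloud[0])
```

The trade-off is that the cloud only sees summaries, so the edge node has to keep or compute anything that needs the raw readings.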
The idea of using a container becomes compelling once you can package your application and all of its dependencies into a container image that runs in any environment whose host operating system supports containers. The concept became more popular and improved considerably with the introduction of container-based application deployment. Containers can behave much like virtual machines without the overhead of a separate operating system. Docker and Kubernetes are the two most popular container platforms: Docker builds and runs container images, while Kubernetes orchestrates containers across large clusters and manages communication between the services running in them.
Today, distributed systems are programmed by application programmers, while the underlying infrastructure is managed by a cloud provider. This is the current state of distributed computing systems, and it keeps evolving.