Linux 101 - Introduction

What is Linux?

The standard operating system (OS) used in HPC environments is Linux. Linux is the colloquial name for GNU/Linux, the combination of the GNU operating system packages and the Linux kernel (for a more comprehensive discussion of this, visit the GNU website at https://www.gnu.org/).

Linux is generally obtained via distributions. A Linux distribution combines GNU/Linux with other components to address specific use cases. Many distributions are focussed on providing a desktop experience that can serve as a replacement for Microsoft Windows or Apple's macOS. Examples of these include Linux Mint (https://linuxmint.com/) and Fedora (https://getfedora.org/). Other distributions are more focussed on the server environment and include examples such as CentOS (https://www.centos.org/) and Rocky Linux (https://rockylinux.org/). Additionally, you can find distributions that serve no clear functional purpose, such as Hannah Montana Linux (http://hannahmontana.sourceforge.net/). All these distributions can be downloaded and used free of charge. It is highly recommended that users of the UFS HPC obtain and use a distribution of Linux on their personal computers, so please see this separate guide for recommendations and hints on getting started with a Linux distribution on a personal computer.

Open Source Software

The reason for the staggering variety and free availability of Linux distributions is that Linux and many of the components used in the distributions are open source software. This means that the source code of this software is available for inspection, modification and distribution without cost. The source code of a computer program is the set of instructions, written in a human-readable programming language such as C++, needed to build the program.

The build process essentially translates these human-readable instructions into machine language (binary) that a computer can understand, producing a binary program that can be executed. As an analogy, think of the source code as the recipe for a cake and the build process as the baking, which produces the cake (the program) that the user can consume.

However, most users are never knowingly exposed to source code or open source software in everyday computing. Companies such as Microsoft and Apple only provide users with binary programs and legally prohibit them (via user agreements) from inspecting, changing and redistributing the source code and/or binaries that are provided to them. And, of course, users are charged a fee for the use of these binaries. This type of software is generally referred to as proprietary software.

Advantages

The advantages of open source software are numerous, but here are some important ones. First and foremost, the software is provided free of charge. Some companies monetize the software in other ways, for example by providing support for a fee, but most open source software is simply free of charge. Another advantage is the transparency of the source code: because more eyes are on the code, it is often easier to spot and fix bugs. For the researcher/scientist specifically, the transparency, coupled with the free redistribution and modification of the software, aligns with the philosophical underpinnings of their occupation and is thus a natural fit for most software developed in the research environment.

Disadvantages

However, there are some disadvantages to open source software. The most important one for regular users is the complexity of having to compile software (although most popular open source software is also provided in binary form). Also, most open source projects rely on the spare time of their developers and community to maintain the software in question. Fortunately, because some open source projects are pillars of the world's modern cyber-infrastructure, funded foundations have emerged to employ permanent developers for these projects.

With the lengthy discussion of software distribution models out of the way, we can now focus on using Linux in the HPC environment.