If you've been working in or around technology at all in the last few years, chances are high that you've heard the words Git or maybe GitHub. You've almost certainly heard the terms Open Source Software or OSS, for short. What do these all mean, and why are these concepts important to understand as we examine the software development industry at large?
First, let's define the two main ways in which software is built and brought to market:
- Closed-Source: The software you work on is typically a commercial product, or exists to serve a commercial interest. Every line of code written is only seen by those working on it. The code, and the intellectual property that goes along with it, is an asset of the individual or company that owns it. Most software that is sold, whether as packaged software or software-as-a-service, fits into this model. The source code behind your favorite web-based service likely does as well. After all, if a company's "secret sauce" has anything to do with the software code, it probably doesn't want anyone else having access to it. Companies compete by recruiting the best developers to build the best software, and these assets are kept behind closed doors under lock and key.
- Open-Source: The code behind the software you work on, whether it is a commercial product or not, is available to the public.
You may be asking – what motivates a software developer or company to pour hours – in some cases, thousands of hours – of work into a software project, only to make the code free and fully available to the public? The answer varies with the situation, but a few examples include:
- Crowdsourcing. By releasing source code to the public, a developer may be hoping for suggestions on improvement or, better still, contributions of work to the code that make the project better over time.
- The idea of transparency and "knowing what's in your food". For example, the developers of Telegram, a messenger app, publish the source code of their client apps so that anyone can check how the app handles personal data – something that closed-source commercial products can't offer.
- Credibility and reputation. An individual developer may choose to make a portion of their work open-source, in order to showcase their abilities to potential employers or future partners.
- Recruiting. A company may make some of their work open-source to attract potential future employees with the cool stuff they're building.
These are all great reasons for a developer or company to contribute their work to the world of OSS. But the biggest reason is a science-based culture of sharing and "no need to reinvent the wheel". As renowned computer scientist and Stanford CS professor emeritus Donald Knuth once said, "People think that computer science is the art of geniuses but the actual reality is the opposite, just many people doing things that build on each other, like a wall of mini stones." Software, like most things, is better built in cooperation with other humans. This idea – even within the context of commercial enterprises – has largely prevailed in the last decade or so over the concept of closed-source, proprietary software.
Licensing
How open-source software is licensed is a big topic, with many licensing models and derivatives of each model. Generally though, there are two ways in which an open-source developer might decide to treat their software project when sharing it with others:
- You may use this code in your project, but if you modify it and distribute the result, you must license the new work under the same model – the code can be used unmodified without restriction (GNU General Public License and other "copyleft" licenses)
- You may use this code however you want – you can modify it, sell it commercially, or include it within another open-source project (Apache, MIT, and other "permissive" licensing models)
More information on choosing a software license can be found at https://choosealicense.com/. Regardless of the licensing model used, it is certainly possible to use OSS to augment, or as the basis for, commercial ventures.
Case Study: The Linux Family Tree
Consider the source code for the Linux operating system. The core functionality of Linux (the "kernel") has been reused (and in some cases "forked") many times in both other OSS projects and commercial projects. Developers have taken the open-source Linux kernel source code and built their own "flavors" of Linux, usually called distributions, such as Ubuntu Linux, Red Hat Linux, Fedora Linux, and so on. It's estimated that nearly 78% of the Internet runs on servers powered by some flavor of Linux (96% of the top 1 million).
Meanwhile, in 2008, a team at Google released another "flavor" of Linux – the Android operating system. The Linux kernel inside Android remains under the GPL, but the rest of the platform that Google built on top of the kernel is separate code, so Google is able to license it under the Apache License – a less restrictive, permissive license that allows others to modify and/or build on top of it, even for commercial purposes.
Today, Android is, as most of us know, the #2 mobile phone operating system behind iOS in the United States, and the most widely used one worldwide. As mentioned, Android is permissively open-source, although Google and its partners do bundle commercial, closed-source products, like Google Chrome and Google Play Store, alongside Android on the mobile hardware they sell. This is a straightforward example of one way open-source can be used to power for-profit enterprises.
Because Android is licensed using the Apache licensing model, developers are free to incorporate, modify, and distribute Android as they wish, even in commercial ventures. As a result, many companies have now forked the Android OS and transformed it into the basis for their own products, including manufacturers of tablets, set-top entertainment devices, and many other types of hardware. For example, Amazon has its own commercial, closed-source fork of Android – Fire OS – that it uses on its Amazon Kindle Fire products. Facebook has built its own closed-source and proprietary Oculus VR product on top of Android, and Peloton uses Android as the backbone for its home-based fitness devices. All of these ventures can find their roots in Android and, ultimately, Linux.
Other Interesting Open-Source Projects
Git and GitHub
Those trying to understand open-source software for the first time often experience a bit of naming confusion among all the products and companies in this particular niche of an industry, so here's just a quick overview:
- Git is source control software, used by developers to store, manage, version, and collaborate on source code. It was created by Linus Torvalds (the creator of Linux) and, naturally, Git is open-source software.
- GitHub is a for-profit corporation (owned by Microsoft) that is in the business of providing Git-related products and services, including Git project hosting and open-source developer productivity tools. Although we will be using GitHub to consume and share source code in this course, other companies offer similar services, such as GitLab and Bitbucket.
Git/GitHub How-To
Code projects in Git are organized into repositories. Typically, a Git repository houses a single code project, like some of the examples that we've discussed thus far. One of the primary services provided by GitHub is Git repository hosting. A GitHub user or organization will typically have several Git repositories under its account. For instance, the Microsoft organization on GitHub contains many repositories, one of which is the source code for the popular Visual Studio Code editor.
As a first time software developer and GitHub user, you're not going to have any repositories within your GitHub account. It's time to change that.
There are two primary ways to create our own repository in GitHub:
- Create a repository by clicking the New repository option – either start from scratch or from code you already have on your computer, or;
- Go to an existing repository and use it as a template to start your own repository. This is the option we'll begin with, and will be using quite a bit throughout this course.
Start by visiting https://github.com/entr451-spring2024/hello-world and clicking the Use this template button. You'll be directed to a Create a new repository page – for the Repository Name, type hello-world, set the repository to be Public, and click the Create repository from template button. It should take just a few seconds to create the brand-new repository in your GitHub account. You did it – you created your first GitHub repository!
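Every GitHub repository lives at a URL of the form https://github.com/owner/repository, where the owner is a user or an organization. The short Python sketch below pulls those two parts out of a URL, just to make that structure visible. The function name split_repo_url and the account name your-username are made up for illustration; your own copy of the template will sit under whatever your GitHub username actually is.

```python
from urllib.parse import urlparse


def split_repo_url(url):
    """Return (owner, repository) from a GitHub repository URL."""
    # The path looks like "/owner/repository"; drop the slashes at the ends.
    path = urlparse(url).path.strip("/")
    owner, repo = path.split("/")[:2]
    return owner, repo


# The course template, taken from the URL above.
template = "https://github.com/entr451-spring2024/hello-world"
# Where your copy would live; "your-username" is a hypothetical account name.
mine = "https://github.com/your-username/hello-world"

print(split_repo_url(template))
print(split_repo_url(mine))
```

Both URLs end in the same repository name, hello-world; only the owner differs. That is what it means for the new repository to be in your account rather than in the course organization.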
Have a look around – you'll see that a GitHub repository is nothing more than a folder filled with files and other folders. You'll find that you can add your own files, as well as edit the files already there. Try editing the README.md file by clicking the little "pencil" icon. Make any change you want, then scroll to the bottom, to the Commit Changes section. The first box, which is pre-filled with Update README.md, is known as the Commit Message. This message is used to make a note to yourself (and others), describing the change you've made. Type something in the commit message (e.g. Made the README more awesome) and click the Commit Changes button. This takes the work you've done – in this case, making an edit to the README.md file – and commits the work. That is, it takes the unit of work you've performed and records it as a permanent change in your Git repository.
It's important to understand that code and other files that are part of a Git repository are tracked by Git, and every commit is recorded. Because of this record, Git knows the entire history of your code; that is, what files were added, modified, and deleted by you and other contributors over time. This is what gives Git and other version control systems powerful abilities, such as reporting on individuals' contributions to our code project (what was contributed and when), time-travel (the ability to view or revert to our code at a previous point in time), and branching (different versions of the same project that began at the same starting point). Professional software developers use these features (and more) to manage complex software projects each and every day.
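To make the idea of a recorded history more concrete, here is a toy model in Python. It is only a sketch and not how Git is actually implemented: the names Commit and ToyRepository, and the contributors alice and bob, are invented for illustration. Each commit stores an author, a message, and a full copy of the files as they stood at that moment, which is enough to demonstrate both the contribution report and "time-travel".

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Commit:
    """One recorded unit of work: who, why, and the files as they stood."""
    author: str
    message: str
    files: dict  # file name -> full contents at this commit


@dataclass
class ToyRepository:
    """A deliberately simplified stand-in for a Git repository."""
    history: list = field(default_factory=list)

    def commit(self, author, message, changes):
        # Start from the previous snapshot (or an empty one), then apply changes.
        snapshot = dict(self.history[-1].files) if self.history else {}
        for name, contents in changes.items():
            if contents is None:
                snapshot.pop(name, None)  # None marks a deleted file
            else:
                snapshot[name] = contents
        self.history.append(Commit(author, message, snapshot))

    def files_at(self, index):
        # "Time travel": the files exactly as they were at an earlier commit.
        return dict(self.history[index].files)

    def log(self):
        # Who contributed what, oldest first.
        return [(c.author, c.message) for c in self.history]


repo = ToyRepository()
repo.commit("alice", "Initial commit", {"README.md": "# hello-world"})
repo.commit("bob", "Made the README more awesome",
            {"README.md": "# hello-world\nNow more awesome."})
repo.commit("alice", "Add notes", {"notes.txt": "todo"})

for author, message in repo.log():
    print(f"{author}: {message}")

# The README as it looked at the very first commit, before bob's edit.
print(repo.files_at(0)["README.md"])
```

Because nothing in the history is ever overwritten, the first version of README.md is still available after two later commits. Real Git stores its history far more efficiently, but the principle of never losing an earlier committed state is the same.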