When contributing to Open-Source Software (OSS), it’s of utmost importance for every individual, team, organization or community to have a set of principles that initiate the discussion and guide responsible sharing with the rest of the world.
In this article, I’m going to talk about the principles I follow when I engage with OSS at work.
There are 3 major areas of focus to be aware of when it comes to responsibly engaging in OSS communities and contributing to public repositories.
These areas are Data, Operations and Exposures. Together they cover, end to end, the aspects that might be targeted in the OSS world. Let’s talk about each of them in detail.
The data aspect of any software project is one of the most crucial pieces that power any system. In the software industry today, data has often become more important than the software that produces it. Many acquisitions today mainly target users’ information, not the software that collected that information.
But data can take 3 different forms: access data, identification data and information. Let’s dive into those.
0.0 Access Data
Access data refers to passwords, secrets, tokens or any other form of data that can be leveraged to access a non-public resource such as a database, a protected API or any other system.
Every system mandates one form or another of security. Access data sometimes gets mixed into the operational routines engineers develop to integrate distributed systems with one another.
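To make that mix-up a bit more concrete, here is a minimal sketch of a check that flags lines where a secret-looking name is assigned a quoted literal. The pattern and the `find_hardcoded_secrets` name are my own assumptions for illustration; this is a naive heuristic that will miss many cases and is not a replacement for a dedicated secret scanner.

```python
import re

# Hypothetical, deliberately simple pattern: a secret-looking name
# followed by "=" or ":" and a quoted literal of at least 4 characters.
SECRET_ASSIGNMENT = re.compile(
    r"(password|passwd|secret|token|api[_-]?key)\w*\s*[:=]\s*[\"'][^\"']{4,}[\"']",
    re.IGNORECASE,
)


def find_hardcoded_secrets(source: str) -> list[int]:
    """Return the 1-based line numbers that look like hardcoded access data."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if SECRET_ASSIGNMENT.search(line)
    ]


sample = (
    'db_password = "hunter2-prod"\n'      # flagged: literal secret in code
    'token = os.environ["API_TOKEN"]\n'   # not flagged: read at runtime
    'name = "alice"\n'                    # not flagged: not a secret-like name
)
flagged = find_hardcoded_secrets(sample)
```

A check like this could run before a commit is pushed, so that access data is caught before it reaches a public repository.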
0.1 Identification Data
Identification data includes, but is not limited to, data that can identify individuals, organizations, communities or any other identifiable entity. It can come in different forms, such as contact details (email addresses, phone numbers) and Social Security numbers (SSNs).
But identification data can also indicate locations, physical addresses, languages, locales, background medical information and anything else that can be used to profile an individual.
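As a rough illustration, the sketch below masks a few of these forms (email addresses, US-style phone numbers and SSNs) in free text before it is shared, for example in a public issue or log excerpt. The patterns and the `redact` name are assumptions of mine and cover only a small slice of what identification data can look like.

```python
import re

# Illustrative patterns only; real identification data takes many more forms.
# SSN is listed before PHONE so each pattern sees the not-yet-redacted text.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def redact(text: str) -> str:
    """Replace each matched value with a placeholder such as <EMAIL>."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    return text


message = "Reach jane.doe@example.com or 555-123-4567, SSN 123-45-6789."
print(redact(message))
```

Pattern matching like this only catches well-formatted values, so it works best as a safety net on top of not collecting or sharing the data in the first place.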
0.2 Information
In general, information is meant to be shared. There’s a whole category of OSS called Open Data, where individuals, organizations and communities share open datasets with the OSS world so that researchers everywhere can leverage that data with AI/ML to develop efficient systems.
But some information is not for sharing, at least not in the short term. Information tied to investigations, or held by private entities, sectors and industries where disclosure could harm the business, is not meant to be shared.
That type of information needs to be secured and protected behind firewalls, encryption and any other available or potential security mechanism to ensure that information doesn’t fall into the wrong hands.
OSS contributors should follow these guidelines to avoid leakage or exposure of any identification information at all costs:
- Avoid hardcoding any data. Regardless of how preliminary, exploratory or experimental the work may be, refrain from hardcoding any data if the project you are working on is or will be tracked by any source control mechanism
- Use non-tracked/ignored local settings files for offline testing, and ensure that your local settings files are never shared or transferred over any communication medium
- Use Key Vault or any other user-principal-based system to ensure engineers are required to identify themselves before retrieving configuration data
- Use encryption for data at rest to avoid potential breaches and leakage of information
- When possible, use bogus data to test your systems in lower environments and less secure settings to avoid data exposure
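The first two guidelines can be sketched roughly as follows: configuration is read from the environment first, then from a local settings file that source control ignores, and never from a value baked into the code. The file name `local.settings.json` and the `get_setting` helper are hypothetical choices for this example, and the file is assumed to be listed in `.gitignore`.

```python
import json
import os
from pathlib import Path

# Hypothetical file name; it is assumed to be listed in .gitignore
# so that it is never tracked by source control.
LOCAL_SETTINGS_FILE = Path("local.settings.json")


def get_setting(name: str, settings_file: Path = LOCAL_SETTINGS_FILE) -> str:
    """Return a setting from the environment, else from the untracked local file."""
    value = os.environ.get(name)
    if value is not None:
        return value
    if settings_file.exists():
        local = json.loads(settings_file.read_text(encoding="utf-8"))
        if name in local:
            return str(local[name])
    # Fail loudly instead of falling back to a hardcoded default.
    raise KeyError(f"Setting {name!r} is not configured")
```

Failing with an error when a setting is missing is a deliberate choice here: a silent default is exactly the kind of hardcoded value the guidelines warn against.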
In the OSS world, operations refer to all the routines engineers develop to achieve a certain objective. For instance, in a schooling system, an operation would be all the code involved in delivering a student registration workflow. This includes validation, error handling, recovery and tracing in addition to processing data and persisting it.
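Here is a minimal sketch of what the validation and persistence parts of such a registration routine might look like. Everything in it is hypothetical: the `Student` fields, the 1 to 12 grade range and the in-memory dictionary standing in for real storage are assumptions made only to keep the example self-contained.

```python
from dataclasses import dataclass


class ValidationError(Exception):
    """Raised when a student record fails validation."""


@dataclass(frozen=True)
class Student:
    student_id: str
    name: str
    grade: int


def validate(student: Student) -> None:
    """Reject records with missing or out-of-range values."""
    if not student.student_id.strip():
        raise ValidationError("student_id is required")
    if not student.name.strip():
        raise ValidationError("name is required")
    if not 1 <= student.grade <= 12:  # assumed grade range
        raise ValidationError("grade must be between 1 and 12")


def register(student: Student, storage: dict) -> Student:
    """Validate first, then persist; nothing is stored if validation fails."""
    validate(student)
    if student.student_id in storage:
        raise ValidationError("student is already registered")
    storage[student.student_id] = student
    return student


storage = {}
registered = register(Student("S-001", "Ada", 5), storage)
```

The point of the ordering is that invalid input never reaches storage, and every rejection surfaces as an explicit error the caller has to handle.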
Some routines are more likely than others to reveal a security issue that exposes certain data types to exploitation, such as missing validations or missing error-handling flows. Wrongdoers can leverage these vulnerabilities in existing routines to infiltrate any publicly accessible system and cause damage.
Developing system flows and routines that are fully secure is much more complex than securing secrets and private or identifiable data. That’s simply because routines indicate what can be done, whereas data only reflects what already is.
Here’s a general set of actions and practices OSS contributors should follow to ensure these vulnerabilities are covered:
- Follow best practices and engineering standards to ensure these potential vulnerabilities are covered systematically while developing certain flows for their systems.
- Follow modern software development practices such as TDD and Pair Programming to ensure every routine gets the right amount of attention and review before being released out in public.
- Keep up to date with the latest framework updates, standards and engineering best practices as the engineering community continues to discover and share more secure and powerful ways to implement business workflows.
There are certain situations where, regardless of how standardized a routine may be, the modeling and simulation process itself can be indicative of private initiatives, products or future releases. OSS contributors must consult with their architects, designers and tech leaders to ensure these routines are designed in a way that does not reveal any future products.
Exposure covers all the communications, discussions, designs and demos for any given software. It’s important to understand how much an OSS contributor may expose when discussing certain design decisions with the OSS community.
This also includes comments on PRs, public demos and potential interviews, podcasts, YouTube videos or articles about the OSS product. In some cases, the roadmap is part of the OSS contribution and the community controls it. But that’s not always the case in partially open-sourced projects, and for these projects OSS contributors must be cautious with their engagements and communications in public forums.
Here are some guidelines OSS contributors can follow to make an exposure violation less likely:
- Use your best judgement in engaging with the OSS community in terms of future topics and what certain flows are potentially going to be used for.
- When in doubt, consult with the OSS champion on your team and your leads before engaging in any discussion that you deem suspicious or leading towards exposing some information.
- If a leak has already occurred, report it immediately to your leadership team so they can make the best decision about handling the situation.
In summary, engaging in OSS is a great opportunity to share our knowledge with the world, and it also helps foster an environment of responsible sharing and giving back to the community. But just like with every other technology, a wrongdoer could leverage a well-intentioned initiative to cause harm and damage. Therefore, we have to strictly follow these principles to protect ourselves, our organizations, our communities and the users who entrusted us with their data so that we can empower them to achieve more.