Sharks in the Moat
The type of testing I just described is best represented by a technique called fuzzing, or testing by the injection of random information and then observing how the application behaves. The data produced by fuzzing is called fuzz data or fuzzing oracle, and can equally apply to both white and black box testing. With white box testing, fuzzing is targeted to how the source code is written, and it is easier to ensure full code coverage since everything is known about the various paths that need to be tested. With black box testing that has zero knowledge, there is little guarantee that all possible paths are covered.
Fuzz data can be created using either recursion or replacement. In recursive fuzzing, fuzz data is created by iterating through all possible pre-defined values. With replacement fuzzing, all possible values are not pre-defined as a set to be iterated through, but instead are created using an algorithm to replace portions of the data until all possible values have been created. Independent of how fuzzing data is created, there are two approaches to fuzzing execution – generation-based and mutation-based.
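The two ways of creating fuzz data can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names, the tiny alphabet, the `FUZZ` marker and the request template are all hypothetical and do not come from any particular fuzzing tool.

```python
from itertools import product


def recursive_fuzz(alphabet, length):
    """Recursive fuzzing: iterate through every value of a pre-defined set.

    Here the set is every string of the given length over the alphabet.
    """
    for combo in product(alphabet, repeat=length):
        yield "".join(combo)


def growing_strings(max_power):
    """Algorithmically produce ever-longer strings (1, 2, 4, ... characters)."""
    for power in range(max_power + 1):
        yield "A" * (2 ** power)


def replacement_fuzz(template, marker, values):
    """Replacement fuzzing: swap one portion of the data for each generated value."""
    for value in values:
        yield template.replace(marker, value)


# Hypothetical inputs, chosen only to keep the output small.
recursive_cases = list(recursive_fuzz("ab1", 2))
replacement_cases = list(
    replacement_fuzz("name=FUZZ&mode=view", "FUZZ", growing_strings(3))
)
```

With a three-character alphabet and a length of two, the recursive generator produces nine cases, while the replacement generator produces one case per generated value and leaves the rest of the template untouched.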
Generation-based fuzzing requires some knowledge of the internal algorithm or process, usually as a result of carrying out white box testing. This approach examines the expected data structures, messages and sequence of events, and ‘messes’ with the input so it does not match what the code expects. Because the data is generated in a purposeful way and requires knowledge of the internal mechanisms, this is often called smart fuzzing or intelligent fuzzing. The downside of smart fuzzing is that it initially requires time to set up and execute, and it will not cover new or proprietary protocols since knowledge of the protocol is required. On the upside, smart fuzzing will result in greater code coverage.
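To make smart fuzzing concrete, the sketch below assumes a made-up message format: a 2-byte big-endian length field followed by the payload. Because we 'know' that structure, each generated case deliberately breaks one rule of it. The format, function names and case labels are assumptions for illustration only.

```python
import struct


def build_message(payload: bytes, declared_length=None) -> bytes:
    """Build a message in the assumed format: 2-byte big-endian length + payload."""
    length = len(payload) if declared_length is None else declared_length
    return struct.pack(">H", length) + payload


def generate_smart_cases(payload: bytes):
    """Yield (label, message) pairs that each violate the known format.

    Assumes a non-empty payload shorter than 65,535 bytes.
    """
    yield "length_too_small", build_message(payload, len(payload) - 1)
    yield "length_too_large", build_message(payload, len(payload) + 1)
    yield "length_zero", build_message(payload, 0)
    yield "length_max", build_message(payload, 0xFFFF)
    yield "empty_payload_nonzero_length", build_message(b"", 10)


cases = dict(generate_smart_cases(b"hello"))
```

Each message would then be sent to the code under test to see whether it rejects the mismatch cleanly or misbehaves.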
The alternative to smart fuzzing is dumb fuzzing, officially called mutation-based fuzzing. Dumb fuzzing has no foreknowledge of the data structure or protocols used, and it relies on existing data samples to figure this out, using either recursion or replacement to create the fuzz data. The type of testing I carried out and described previously was a form of dumb fuzzing, and it can often lead to DoS or destruction of data. This does not mean it has less value than smart fuzzing – it simply means that dumb fuzzing is best carried out in a non-production environment ‘just in case’.
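A minimal sketch of dumb fuzzing follows. It takes an existing data sample and overwrites a few randomly chosen bytes, with no understanding of what the sample means. The sample, the function names and the fixed seed (used only so a run can be repeated) are assumptions for illustration.

```python
import random


def mutate(sample: bytes, rng: random.Random, max_changes: int = 3) -> bytes:
    """Return a copy of a non-empty sample with 1..max_changes bytes overwritten."""
    data = bytearray(sample)
    for _ in range(rng.randint(1, max_changes)):
        position = rng.randrange(len(data))
        data[position] = rng.randrange(256)  # any byte value, valid or not
    return bytes(data)


def mutation_fuzz(sample: bytes, count: int, seed: int = 0):
    """Produce `count` mutated variants of the sample, reproducible via the seed."""
    rng = random.Random(seed)
    return [mutate(sample, rng) for _ in range(count)]


# Hypothetical captured input that the application normally accepts.
sample = b'{"user": "alice", "age": 30}'
cases = mutation_fuzz(sample, 5)
```

Because the mutations are blind, some variants will corrupt the structure entirely, which is exactly why this kind of run belongs in a non-production environment.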
Figure 143: Software-Related Intellectual Property
Intellectual Property (IP) Ownership and Responsibilities
Regardless of your views on capitalism, at its core lies the principle of protecting intellectual property, or IP. The World Intellectual Property Organization, or WIPO, defines IP as the ‘creations of the mind’. Capitalism is based on the idea that if you invent something from your mind, then you should profit from it in your pocket. The protection of IP is all about that core concept – preventing someone else from being able to reproduce the hard work of your own mind, thereby preserving your ability to profit from it. Profit doesn’t always mean money though – many people invent things simply because they want to help the human race, with recognition of their achievements being reward enough. Recognition can sometimes be the greatest reward.
Getting back to the world of software, the protection of IP is one of the fundamental purposes for supply chain management, which we cover in the next section. In this chapter we are going to deal mostly with the types of licensing and leave any discussions of implementing protection in the supply chain for a different conversation.
Types of IP
As shown in Figure 143, IP comes in two primary flavors – industrial property, and copyright. Industrial property can be further broken down into two sub-categories – innovation and fair competition. Innovative IP covers the design and creation of technology, such as inventions and trade secrets. Fair competition IP protects consumers by allowing them to distinguish one product or service from another, such as trademarks or brands. Copyright protects authors of written works or artistic works such as paintings. Many people find it odd that software falls under the literary and artistic umbrella instead of the more technologically-focused innovative sub-category. As a programmer myself, I think it makes perfect sense, since it takes a great deal of creativity to design and implement useful, quality software.
Let’s use a rather silly example to help you keep this clear in your own mind. Suppose you come up with an algorithm that uses specific calculus formulas to determine if random signals from outer space represent intelligent life. You decide to create a device to sell to other alien-seekers. Since the calculus formulas are what set you apart from other companies selling similar devices, you classify the information as a trade secret and make sure that anyone having access to the secret formulas signs a non-disclosure agreement, or NDA. You then write proprietary software using the formulas and seek legal protection from anyone else stealing your code by getting a copyright for the software. You become wildly successful but cannot manufacture enough devices to keep up with demand. So, you file for a patent for your formulas – by doing this the formulas stop being a trade secret as patents are publicly viewable by anyone. However, no one following the law can use your patent unless they pay you money – which you gladly accept so that others can manufacture devices based on your design. However, you feel that your devices are far superior to those of other manufacturers, and to promote sales of your own items you heavily market the phrase ‘The Original Makers of Alien-Detecting Gizmos!’. To prevent others from using this phrase, you trademark it.
Patent
The strongest form of IP protection is a patent, which grants an inventor the exclusive rights to an invention as long as the invention is a novel, useful and non-obvious idea that offers a new way of doing something or solving a problem. You can’t patent air, but you can patent an air filter. You can’t patent the concept of solar heat coming through a glass window, but you can patent a new window design that increases the amount of solar heat captured. To ensure inventions are eventually freely available, patents are given to the owner only for a limited time – usually 20 years. During this time, the inventor may choose to manufacture devices using the invention or allow others to manufacture devices based on the design by charging them money for the privilege of using the design. When the patent expires, the inventor can no longer profit from whoever uses his or her design. That doesn’t mean that the owner cannot continue to sell implementations of the design, but he or she can no longer prevent or charge others for using the design.
Returning to the definition of what is patentable, an invention can be a product or process as long as it meets four criteria:
1) It must be of practical use. You probably will not be awarded a patent for a device that channels cow brain powers into convincing pigs to cluck like a chicken. Although that would certainly be entertaining for a short while.
2) It must be novel. The invention must have at least one characteristic that is not found in any existing patent. Using the correct terminology, there must be no prior art.
3) It must demonstrate an inventive step. You can’t simply take the idea of a blimp, paint it red and call it a ‘Highly-Visible Dirigible’. That is hardly innovative.
4) It must be compliant with and deemed acceptable in a court of law. A new method for breaking into someone’s house will not be awarded a patent, unless the patent is filed in a country where theft is legal. It all depends on the jurisdiction where the patent request is filed.
The whole point behind patents is to encourage innovation by rewarding those who expend energy coming up with inventions. When it comes to software, the debate over whether patents apply still rages. Some countries say yes and others say no, which makes selling software across national borders problematic. The best path is to instead use copyrights to protect software designs, algorithms and program code, which we will get to in just a few minutes.
Trade Secret
When a company is in possession of some type of confidential information that gives them an edge over competitors, we say they have a trade secret. This ‘edge’ can be a design, formula, method, strategy, or information, but must have the following three characteristics:
1) It must not be generally known or easily accessible. How to mix blue and red to make the color purple is not a trade secret, as most anyone knows how to do this.
2) It must have commercial value that is reduced should it be disclosed. While only you may know how to accurately reproduce the mating call of the bushy-tailed cotton bird, no one cares. You can’t call this information a trade secret.
3) It must be protected by the holder through confidentiality agreements. If you don’t take steps to keep your trade secret ‘secret’, then you can’t complain when everyone finds out.
Valid examples of a trade secret include the formula for Dr. Pepper, or the source code for SQL Server. While copyright is the best protection for source code from an ownership perspective, making all code or even portions of it a trade secret may help in some cases to prevent a competitor from using it. Of course, you will need to add the proper access controls around the source code repository before you can claim to have protected it as a trade secret. Deploying software in an object form will not invoke trade secret protection, and even a non-disclosure agreement, or NDA, may not be enforceable. Technical measures against reverse engineering efforts must be implemented.
Trademark
When your company has one or more competitors engaged in the same market, you will need to find a way to differentiate yourself. Perhaps you are known for the lowest prices, or better quality, or perhaps even better customer service. When a potential purchaser is looking at both you and your competitor side-by-side and about to make an impulse buy, it is crucial that your name or logo says, ‘Buy from me – I am better than that other guy!’ The name or logo is called a trademark, and once registered can no longer be used by anyone else – only by the person or organization that registered it. A trademark grants the owner the exclusive right to use it to identify products or services, or even to license others to use the trademark. While a trademark has a limited lifetime, it can be renewed indefinitely as long as the owner stays on top of things. In this respect, it acts in much the same manner as an Internet domain – it is yours as long as you renew it before it lapses. Fall behind, and you might be out of luck!
A trademark can be a word, letter, phrase, numeral, drawing, a three-dimensional sign, a piece of music, vocal sound, fragrance, color, symbol, or any combination of those just mentioned. There is no need to protect a trademark from disclosure, as by definition a trademark is meant to be seen. Having said that, if a trademark has been registered but not yet disclosed to the public, then it could be seen as a trade secret until the official unveiling.
Obviously, you can’t run out and trademark the letter ‘A’ or the color red. But you can trademark the letter ‘A’ written in your custom font, and even the color ‘dazzle rocket-ship red’ with a specific shade of red. UPS brown is trademarked as ‘Pullman Brown’, as is the ‘A’ in the Avengers movie franchise using their distinctive design. When it comes to software, you should seriously consider trademarking your name when the general audience starts associating your name with the functionality provided by the software. For example, how often do we use the name ‘Excel’ to mean a spreadsheet in general? The classic example of a product name that took on the functionality is the brand ‘Band-Aid’ – when is the last time you asked for an ‘adhesive bandage’?
An even better example can be heard down here in Texas, such as:
“Hey, you want a coke?”
“You betcha, bubba.”
“What kind of coke you want?”
“Uhm, gimme a Dr. Pepper!”
Generally speaking, we Texans tend to take an English grammar rule as more of a ‘suggestion’. And we’re proud of it, dang it!
Copyright
When discussing inventions, a patent protects the idea itself, not whatever is done with the idea. For example, let’s say I invent a new solar-powered gerbil harness. I will be awarded a patent for the idea of such a thing, but when I create drawings to illustrate how the harness might be built, then a copyright is awarded. A copyright protects the expression of an idea and includes technical drawings such as software architectural diagrams. By obtaining a copyright, I can now prohibit others from using my technical drawings even if they alter the drawing for their own use. Just as we can charge others for using our patent, we can also charge others who use our solution concepts. While patents usually last for 20 years from the filing date, a copyright is designed to protect the creator’s heirs as well and often does not expire until 50 years after the creator’s death.
Copyright protection is the best approach to protecting software from a legal perspective, although its usefulness is extremely limited in areas of the world where software piracy is rampant and the government does little to prosecute violators. China leads the list of the worst offenders in software piracy, but some people might be surprised to note that the U.S. is often listed as the second worst offender due to companies using unlicensed business software. Iran, Russia and India complete the top 5 slots.
Peer-to-peer sharing of files is the most notorious vehicle for copyright violations. All software should present the end-user with an end-user license agreement, or EULA, that must be explicitly accepted. While this does help when prosecuting violators, it does not stop them from using the software. Implementing a ‘phone-home’ capability in which your software contacts a publicly-accessible licensing server can help stop illegal software from functioning, but can have a negative usability impact. For example, Microsoft Office will check to see if you have a valid license agreement on-file by connecting to a central server, but if that service goes down for an extended time it can be truly frustrating.
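The usability trade-off of a phone-home check can be sketched as follows. This is a hedged illustration, not how any real product works: `check_server` is a hypothetical callable standing in for the call to the licensing server, and the seven-day grace period is an assumed mitigation that the text does not prescribe.

```python
from datetime import datetime, timedelta

# Assumed value for illustration; a real product would choose its own policy.
GRACE_PERIOD = timedelta(days=7)


def license_allows_use(check_server, last_success, now):
    """Decide whether the software may run.

    check_server: hypothetical callable that returns True/False for the license,
                  or raises ConnectionError if the licensing server is down.
    last_success: datetime of the last successful validation, or None.
    now:          the current datetime.
    """
    try:
        return bool(check_server())
    except ConnectionError:
        # Server unreachable: fall back to the last known good check so that
        # a paying customer is not locked out by a short outage.
        if last_success is None:
            return False
        return now - last_success <= GRACE_PERIOD
```

Without a fallback of some kind, every outage of the licensing server becomes an outage of the product, which is the frustration described above. With one, there is a window in which unlicensed copies keep working, so the length of the grace period is a business decision.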
Licensing (Usage and Redistribution Terms)
A software licensing agreement is a contract that spells out the terms and conditions of how a specific program can be used. Violating a license can result in a company paying penalties and being publicly identified as someone who illegally uses software.
There are two categories of licensing types – free and paid. Each category has several types that we will cover.
Free licensing can be one of three types – open source, freeware or shareware. With an open source license, the software may be used pretty much in any way the user likes without any type of payment – using, copying, distributing, or modifying it are perfectly acceptable uses. A company can even use open source code in their own product and charge for that product. However, the open source license normally requires that a copy of the license accompany each copy of software that uses the open source code.
A freeware license is also free, but the source code cannot be redistributed. Adobe Acrobat Reader is a well-known example.
A shareware license is initially free but requires payment at a later date to keep the software functioning or to unlock all features or benefits. The free period is often called a trial period.
Paid licensing has six different types that differ based on how each copy of the software is allowed to be used.
Per CPU licensing charges a fee for each CPU core running on the computer on which the software is installed. This is normally used for high-end servers with multi-core CPUs.
A per seat license is used when multiple users will be using the software. A seat usually means a single named user.
A license may also limit the number of concurrent users, which is the number of users simultaneously using the system, as opposed to named users, which are simply login accounts. For example, a software package may have 2,000 named users in the database but will only allow 200 of those users to be logged in simultaneously.
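The difference between named and concurrent users can be sketched with a small class that enforces both limits. The class and method names are hypothetical; the numbers mirror the 2,000 named and 200 concurrent users in the example above.

```python
class ConcurrentLicense:
    """Tracks named users (login accounts) and caps simultaneous sessions."""

    def __init__(self, named_users, max_concurrent):
        self.named_users = set(named_users)
        self.max_concurrent = max_concurrent
        self.active = set()

    def login(self, user):
        """Return True if the user may log in under the license terms."""
        if user not in self.named_users:
            return False  # not a named user at all
        if user in self.active:
            return True  # already holds a session
        if len(self.active) >= self.max_concurrent:
            return False  # concurrent limit reached
        self.active.add(user)
        return True

    def logout(self, user):
        """Release the user's session so someone else can log in."""
        self.active.discard(user)


license_pool = ConcurrentLicense(
    named_users=[f"user{i}" for i in range(2000)],
    max_concurrent=200,
)
results = [license_pool.login(f"user{i}") for i in range(201)]
```

The first 200 logins succeed and the 201st is refused, even though that user is a valid named user. Once any active user logs out, the slot is free again.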
A per workstation license is used to install software on a single computer and allow any number of users to access it, which effectively means one user at a time.
Finally, an enterprise license allows unlimited use of the software throughout a single organization without worrying about any of the rules we just covered.
Paid licensing software is sometimes called closed source, as the source code is not usually provided with the software. It is normally called off-the-shelf, or OTS, with multiple variations of commercial-off-the-shelf, or COTS, government-off-the-shelf, or GOTS, and modifiable-off-the-shelf, or MOTS. Software that is licensed as a bundle with hardware is called original equipment manufacturer software, or OEM software.
Figure 144: License Types
MOTS licenses will include source code, and the license agreement allows the purchaser to modify the source. When used for the U.S. military, the software is called military-off-the-shelf, which unfortunately results in the same acronym of MOTS. Figure 144 illustrates the various types of licenses and their relationships.