File Technology

What technique has been used to protect copyrighted material by inserting digital watermarks into a file?

Steganography is closely related to watermarking, and people often use the terms interchangeably. The two techniques share similarities but also differ in important ways. Both are information-embedding techniques that place data inside other digital data such as documents. The purpose of steganography is to hide data in media files like video, audio, still images, and documents for secret message transmission, whereas the aim of watermarking is to protect content from malicious use and deter unauthorized copying.

While steganography makes secret information undetectable, visible watermarking allows anyone to see it. A robust watermark must survive intentional and unintentional manipulation of the marked media, so that the embedded information is not compromised, removed, or destroyed. Fragile watermarks, by contrast, are designed to be destroyed by even minimal modification of the content.

Steganography is used for secret communication by its very nature, whereas watermarking is used for content protection, authentication, and copyright management.

The steganography process hides a message for later retrieval by the intended recipient, preventing external parties from detecting it. In this case, the technique allows secret communication while concealing the existence of a hidden message. An attacker without knowledge of the process cannot detect the message in the document. On the other hand, watermarking conceals data to protect digital media like video, photographs, or audio.

Steganography and digital watermarking are both forms of information hiding: the first aims to keep the information secret, the second to make it subtle. Both methods have been around for centuries, but they have only gained worldwide popularity in the digital world since roughly the mid-1990s. With new technologies constantly emerging, the two methods have been used to prevent theft, prevent plagiarism, track files, and hide secrets, among other things. There is no single standard way of implementing these techniques, and implementations differ from one another. Given the wide range of applications, many companies have produced software to embed and detect both.

What is Steganography?

Steganography is the method of concealing a message so that its very existence is a secret known only to the person(s) who placed it. Historical steganography involved disappearing inks and masking letters to hide the information. Modern steganography hides data in digital media: the media is altered slightly so that a message is embedded without anyone noticing any loss in the actual media file. Since steganography is a form of "security through obscurity", protecting the message and the parties involved through secrecy, it is often combined with cryptography so that even if the message is found, it still cannot be read. [9]

What is Digital Watermarking?

Digital watermarking is the process of embedding a mark of some sort into a media file to distinguish the file from others. The embedded watermark is meant to be permanent, and embedding it alters the media file. A digital watermark can be hidden or in plain sight. Whether to use visible or invisible watermarking depends on the function of the media file, but with either technique, when the media file is copied the watermark is copied as well. Digital watermarks are designed so that they neither restrict the media file nor change it in a way that detracts from its quality. Their main purpose is to detect misuse and act as a form of signature from the owner. [5]

History

The word steganography is derived from the Greek words steganos, which means "covered", and graphia, which means "writing". The first recorded use of steganography appears in the Histories of Herodotus, in the story of Histiaeus, who shaved the head of a slave and tattooed a secret message on his scalp. He then let the hair grow back to hide the message and sent the slave to deliver it. Since then, many steganographic techniques have been devised, from hiding messages in jewelry, to modifying letter strokes and sizes, to placing a mask over text. Many of these techniques were used during World Wars I and II, and some are still used today.

More sophisticated techniques were also used during the wars, such as microdots: microscopic images shrunk to resemble tiny specks of dirt. These microdots were placed on a person or in a letter and were used to transmit vital information or locations. From that point on, and with the emergence of the Internet, the number of steganographic techniques grew. With so many computer networks and so much digitized media, covert steganography became an attractive way for parties to communicate. Steganography has had many uses, good and bad: it provides an extra layer of security for businesses but can also be used by criminals for illegal ends. [11]

The word watermark is thought to derive from the German word wassermarke; the name arose because the marks were thought to resemble the effects of water on paper. Watermarking began in Italy in 1282, when wire patterns added during paper production left designs in the paper once it was molded. Why watermarking was done back then is unclear. By the eighteenth century, however, paper made in Europe and America used watermarks as trademarks, to record when the paper was made, to indicate the size of the sheets, and to prevent counterfeiting. [10]

Counterfeiting in particular drove rapid advances in watermarking technology, as some tried to stop counterfeiters while others tried to remove or copy the watermarks. From basic imprints on paper, watermarks moved on to color, with dyed ink inserted into the paper during the molding process. The next advancement was to raise the paper slightly, producing a small bump on the surface, which is commonly seen on paper currency today. Finally, the idea of digital watermarking appeared in 1979, when Szepanski discussed a machine that could place a pattern on documents for anti-counterfeiting purposes. In 1988, Holt described a way of embedding an identification code in an audio signal, and the term digital watermark first appeared. It was not until 1995 that the world became widely interested in digital watermarks. From then on, many papers were published on ways of implementing digital watermarks, and companies ranging from the music industry to business software began using the technique for security and protection. [8]

Applications

Since the emergence of the Internet, digital distribution of media and documents has grown enormously and continues to grow every year. The distribution of copyrighted material has always been a concern for people trying to protect their work, and this is where digital watermarking becomes useful. Those who are worried about others eavesdropping on their email conversations, reading important documents, or watching their video chats can use steganographic techniques to protect themselves. To understand which applications these techniques suit, it helps to know the different ways of implementing them. Steganography and digital watermarking can be broken down into four subcategories, according to whether the existence of the technique is known or unknown to the public and how the embedded data relates to the original file.

Covert watermarking embeds a watermark tied to the recipient of each copy of the file, without the recipient knowing that the watermark exists. Thus, if the file is leaked to third parties, the recipient who leaked it can be traced. Covert steganography embeds data unrelated to the original work. An overt watermark is one whose presence is known to others besides the creators, either because it is visible or because they have been told about it. Overt steganography resembles covert steganography in that the hidden information is unrelated to the signal in which it is embedded. With overt steganography, however, the information is hidden only from certain parties and visible to others; timestamps are an example. [7]

Knowing the subcategories allows for the following possible applications for both steganography and digital watermarking. Some applications may use both techniques depending on the function required.

Some applications of steganography are [7]:

Protection of Data Alteration - "Digital Certificates" as an example, act as a way of protecting data by embedding the information.
Confidential Communication - Where steganographic techniques could provide a way of communicating between two parties without others knowing.
Media Database Systems - Use of embedded messages in files to quickly identify them.
Access Control - Using access keys to extract content from a steganographic file.

Some applications that use digital watermarking are [5,6]:

Broadcast Monitoring - Identifying where and when the media files are being broadcast by looking at the embedded watermarks.
Owner Identification - The embedded watermark will identify the owner of the media files as a way of copyright protection.
Transaction Tracking - Track media files that were illegally distributed or to determine the route the file took.
Content Authentication - The embedded watermarks can act as signatures whose information can be used to authenticate the originality of the file.
Copy Control - Watermarks can be used in recording equipment to determine what content may be copied.
Device Control - Watermarks can be used to make devices react a certain way and display certain content.
Legacy Enhancement - Watermarks can be used to improve functionality of existing systems.

Companies all over the world have been creating software for the above applications. Commonly used programs for steganography include Steganos, Stego Suit, and Stealthencrypt. Programs used for digital watermarking include Visual Watermark, Ais Watermark, and WatermarkIt. Encryption methods and watermarking schemes give users the reassurance that their data has an extra layer of protection from prying eyes.

Properties

Both methods can be characterized by a set of properties whose importance depends on the application and the role the method plays in it. Weighing which properties matter for your application determines how effective the system will be. Because each method has many implementation techniques, and each technique sets different priorities, the properties that matter for one are not necessarily those that matter for another. Below are the properties commonly associated with the two methods.

Properties of Steganographic Systems

Some properties of steganography are [7]:

Security - The ability to resist attacks whether passive, active, or malicious.
Embedding Capacity - The maximum number of bits that can be hidden in a given media file.
Blind or Informed Extraction - Whether extraction works without a copy of the original file (blind) or requires one (informed).
Embedding Efficiency - The number of secret message bits embedded per unit of distortion.
Statistical Undetectability - How unlikely it is that statistical analysis of the file, under given assumptions, will reveal that a steganographic technique was used.
False Alarm Rate - The probability that an algorithm will detect and report the presence of a secret message when there is none.
Stego Key - Possibility of using a publicly known algorithm to embed a secret message into the media file.
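
Embedding capacity is easy to estimate for simple schemes. The sketch below assumes an uncompressed image and an embedder that changes a fixed number of low-order bits in every color sample; the function name and its parameters are illustrative, not taken from any tool mentioned here.

```python
def lsb_capacity_bytes(width, height, channels=3, bits_per_sample=1):
    """Upper bound on payload size, in whole bytes, for LSB-style embedding."""
    total_bits = width * height * channels * bits_per_sample
    return total_bits // 8


print(lsb_capacity_bytes(640, 480))              # 115200
print(lsb_capacity_bytes(640, 480, channels=1))  # 38400
```

This is only an upper bound: any header, length field, or error-correction data stored alongside the message reduces the usable payload.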

Properties of Watermarked Systems

Some properties of digital watermarking are [5,7]:

Robustness - The ability of the watermark to survive processing of content.
Security - The ability of the watermark to prevent attacks or removal.
Embedding Effectiveness - The probability that the embedder will successfully embed the watermark in a random media file.
Fidelity - The perceptual quality of the watermarked content.
Data Payload - The amount of information that can be carried in the watermark.
Transparency - The level of the opacity required to make the watermark visible or invisible.
False Positive Rate - The rate at which the watermark will be falsely detected in non-watermarked files.
Modification and Multiple Watermarks - The possibility of changing the watermark or embedding multiple watermarks.
Cost - The cost required to embed the watermark into the digital file.

Model and Techniques

Steganography and digital watermarking are two different techniques, but they share similar qualities. Both follow the same basic model, in which two inputs (a signal and the original file) go into an embedder. The signal corresponds to the digital watermark or the secret message, and the embedder contains an algorithm that produces a watermarked file. The output from the embedder is then transmitted or recorded. The recorded file is then sent to a third party, at which point a modification, malicious or not, could occur. The recorded file, modified or not, is then sent as input to a detector, whose algorithm tries to determine whether the signal is present. If it is, the watermark or secret message can usually be extracted, depending on the properties of the system. [10]

Across all the applications of steganography and digital watermarking, many different techniques and variations have been developed. Below is a list of the most popular techniques currently used in digital media.

Steganography Techniques

Some techniques of steganography are [3]:

Least Significant Bit Insertion

Least Significant Bit (LSB) insertion overwrites the rightmost bit of a binary number with a different value. For every 8-bit value in an image, the LSB is replaced with one bit of the hidden message. The downside of this technique is that if the media file, say an image, is ever converted from one format to another (for example, GIF to JPEG), the hidden message could be distorted.
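
The following is a minimal sketch of LSB insertion, assuming the cover is available as a flat sequence of 8-bit samples. The function names `embed_lsb` and `extract_lsb` are illustrative only; a real tool would also need to parse the image format and store the message length somewhere.

```python
def embed_lsb(cover: bytes, message: bytes) -> bytes:
    """Hide the message bits in the least significant bit of each cover byte."""
    # Unpack the message into individual bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover is too small for this message")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # clear the LSB, then set it to the message bit
    return bytes(stego)


def extract_lsb(stego: bytes, message_length: int) -> bytes:
    """Read message_length bytes back out of the LSBs."""
    out = bytearray()
    for start in range(0, message_length * 8, 8):
        value = 0
        for sample in stego[start:start + 8]:
            value = (value << 1) | (sample & 1)
        out.append(value)
    return bytes(out)


cover = bytes(range(100, 164))  # 64 stand-in 8-bit samples
stego = embed_lsb(cover, b"hi")
print(extract_lsb(stego, 2))                           # b'hi'
print(max(abs(a - b) for a, b in zip(cover, stego)))   # never more than 1
```

Each sample changes by at most one intensity level, which is why the change is hard to see, and also why any processing that disturbs low-order bits damages the message.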

Masking and Filtering

The masking and filtering technique hides the secret message by blending it into the noise level of the original file. The noise level is the redundant portion of the image, which gives the message room to hide. As a result, this technique can protect the message from image processing and conversion. However, it is usually only suitable for 24-bit and grayscale images.

Algorithms and Transformations

The algorithms and transformations technique uses the mathematical functions in compression algorithms to hide the secret message, placing the data bits in the least significant coefficients. It is used on JPEG images and other files with lossy compression, since these files use the discrete cosine transform (DCT) to achieve their compression. By altering the DCT coefficients, the secret message changes the relation of the coefficients to one another rather than the actual bits themselves.
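
One way to read "changing the relation of the coefficients" is to encode a bit in which of two chosen coefficients is larger. The sketch below works under several assumptions: it uses a one-dimensional 8-sample DCT for brevity (JPEG uses 8x8 two-dimensional blocks), and the coefficient pair `A`, `B` and the gap `MARGIN` are hypothetical values picked for illustration.

```python
import math

N = 8
A, B = 3, 4       # hypothetical pair of mid-frequency coefficient indices
MARGIN = 4.0      # hypothetical minimum gap between the pair, to survive rounding


def _scale(k):
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)


def dct(block):
    """Orthonormal DCT-II of N samples."""
    return [_scale(k) * sum(x * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                            for n, x in enumerate(block))
            for k in range(N)]


def idct(coeffs):
    """Inverse of dct()."""
    return [sum(_scale(k) * c * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                for k, c in enumerate(coeffs))
            for n in range(N)]


def embed_bit(block, bit):
    """Encode one bit as the ordering of coefficients A and B."""
    c = dct(block)
    hi, lo = max(c[A], c[B]), min(c[A], c[B])
    if hi - lo < MARGIN:  # widen the gap so rounding cannot flip the order
        mid = (hi + lo) / 2
        hi, lo = mid + MARGIN / 2, mid - MARGIN / 2
    c[A], c[B] = (hi, lo) if bit else (lo, hi)
    return [round(v) for v in idct(c)]  # back to integer samples


def extract_bit(block):
    c = dct(block)
    return 1 if c[A] > c[B] else 0


block = [52, 55, 61, 66, 70, 61, 64, 73]
print([extract_bit(embed_bit(block, b)) for b in (0, 1)])  # [0, 1]
```

Because the bit lives in the ordering of two coefficients rather than in any single sample bit, small changes to the samples (here, rounding back to integers) leave it readable.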

Digital Watermarking Techniques

Some techniques of digital watermarking are [4]:

Least Significant Bit Modification

Least Significant Bit (LSB) modification is the same process as in steganography, and the watermark can be embedded multiple times if it is small enough. One extension of the LSB algorithm uses a pseudo-random generator to choose where the watermark is placed. If the watermarked file then undergoes an attack and one of the copies of the watermark survives, the process counts as a success. The downside of LSB is that it is not robust: almost any form of conversion would destroy the watermark. In addition, if an outside party discovers the watermarking algorithm, they only need to set the last bit of each pixel to 1, which would not noticeably degrade the quality of the media file.
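
The pseudo-random placement and the "set every last bit to 1" attack can both be sketched in a few lines. This assumes a flat sequence of 8-bit samples and uses Python's seeded `random.Random` as the pseudo-random generator, with the seed acting as a shared key; the function names are illustrative.

```python
import random


def positions(key, count, cover_len):
    """Key-dependent pseudo-random sample indices, without repeats."""
    return random.Random(key).sample(range(cover_len), count)


def embed(cover, bits, key):
    marked = bytearray(cover)
    for pos, bit in zip(positions(key, len(bits), len(cover)), bits):
        marked[pos] = (marked[pos] & 0xFE) | bit
    return bytes(marked)


def extract(marked, count, key):
    return [marked[pos] & 1 for pos in positions(key, count, len(marked))]


def flatten_lsbs(marked):
    """The attack described above: force every least significant bit to 1."""
    return bytes(b | 1 for b in marked)


cover = bytes((i * 37) % 256 for i in range(256))
mark = [1, 0, 1, 1, 0, 0, 1, 0]
marked = embed(cover, mark, key=1234)
print(extract(marked, len(mark), key=1234) == mark)        # True
print(extract(flatten_lsbs(marked), len(mark), key=1234))  # [1, 1, 1, 1, 1, 1, 1, 1]
```

The attacker does not need the key: flattening all LSBs wipes the mark wherever it was placed, while changing each sample by at most one level.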

Correlation-Based Techniques

Correlation-based techniques add a pseudo-random noise pattern to the media file and rely on the correlation properties of that pattern for detection. Multiplying the watermark pattern by a gain and adding it to the image yields a watermarked image whose robustness depends on how large the gain is. A strength of this technique is that the image can be divided into multiple parts and the watermark added to each part. Additional steps can increase the chance of the watermark surviving an attack, such as pre-filtering the image to reduce the correlation between the pattern and the original work, or using CDMA spread spectrum to scatter the bits.
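
A minimal sketch of the additive scheme follows, assuming the image is a flat list of pixel values and that embedder and detector share a key that seeds the noise pattern. The gain, threshold, and test image are arbitrary illustrative choices, and pixel values are not clipped to a valid range.

```python
import random


def pn_pattern(key, length):
    """Key-dependent pseudo-random pattern of +1 / -1 values."""
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(length)]


def embed(image, key, gain):
    """Add gain * pattern to the image."""
    return [p + gain * w for p, w in zip(image, pn_pattern(key, len(image)))]


def correlation(image, key):
    """Average product of the mean-removed image with the pattern."""
    pattern = pn_pattern(key, len(image))
    mean = sum(image) / len(image)
    return sum((p - mean) * w for p, w in zip(image, pattern)) / len(image)


def detect(image, key, threshold):
    return correlation(image, key) > threshold


image = [100 + (i % 32) for i in range(4096)]  # stand-in image data
marked = embed(image, key=42, gain=2)
print(detect(marked, key=42, threshold=1.0))   # watermark present
print(detect(image, key=42, threshold=1.0))    # unmarked image
print(detect(marked, key=7, threshold=1.0))    # wrong key
```

For a marked image the correlation comes out close to the gain, while for an unmarked image or a wrong key it stays near zero, so a larger gain makes detection more reliable at the cost of a more visible change.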

Frequency Domain Techniques

Frequency domain techniques embed the watermark in the Discrete Cosine Transform (DCT) representation of the image, in which the image is broken down into different frequency bands. The middle frequency bands are typically chosen: they avoid the visually most important low frequencies, which keeps the watermark less obvious in the image, while being less exposed than the high frequencies to removal by compression. This makes the watermark more robust against lossy compression and image degradation. Other techniques, such as pseudo-noise patterns, can be combined with this one to protect the watermark further.

Wavelet Watermarking Techniques

Wavelet watermarking techniques use the Discrete Wavelet Transform (DWT) to separate the image into different resolutions. Once the image has been separated, the watermark can be embedded into these regions. The reason to use such a technique is that higher-quality watermarks can be added, which in turn increases the robustness of the watermark without impacting the image quality. CDMA spread spectrum can again be added to this technique to increase the security of the watermark. This technique is also reported to survive attacks such as conversion and image alteration.
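
To make the idea concrete, here is a deliberately simplified sketch: a single-level, one-dimensional Haar transform stands in for the DWT (real schemes use two-dimensional, multi-level transforms), the watermark is added to the detail coefficients, and extraction is informed, meaning it compares against the original. The `strength` value and function names are illustrative assumptions.

```python
def haar_forward(signal):
    """One level of the Haar transform: pairwise averages and differences."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / 2 for a, b in pairs]
    detail = [(a - b) / 2 for a, b in pairs]
    return approx, detail


def haar_inverse(approx, detail):
    out = []
    for s, d in zip(approx, detail):
        out.extend((s + d, s - d))
    return out


def embed(signal, bits, strength):
    """Nudge one detail coefficient up or down per watermark bit."""
    approx, detail = haar_forward(signal)
    marked = [d + strength * (1 if bit else -1) for d, bit in zip(detail, bits)]
    marked.extend(detail[len(bits):])  # leave the remaining coefficients alone
    return haar_inverse(approx, marked)


def extract(marked_signal, original, count):
    """Informed extraction: compare detail coefficients with the original's."""
    _, d_marked = haar_forward(marked_signal)
    _, d_orig = haar_forward(original)
    return [1 if m > o else 0 for m, o in zip(d_marked[:count], d_orig[:count])]


signal = [10, 12, 14, 13, 20, 22, 30, 28, 40, 41, 39, 38, 25, 24, 18, 17]
bits = [1, 0, 0, 1]
marked = embed(signal, bits, strength=2.0)
print(extract(marked, signal, len(bits)) == bits)              # True
print(haar_forward(marked)[0] == haar_forward(signal)[0])      # True: averages untouched
```

Only the detail band changes, so the coarse, low-resolution version of the signal is identical before and after embedding.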

Countermeasures

Countermeasures, or detection techniques, against covert steganography or watermarking include the use of the light spectrum, magnification lenses, chemical mixtures, search algorithms, and other methods to determine whether a file has been embedded with either technique. The light spectrum and chemical mixtures can only be used on hard copies of the media in question. Magnification lenses can be used on both hard copies and digital copies, magnifying the image to a size at which the watermark or hidden message can be found. Search algorithms can be carried out by hand, but this is quite difficult; they are usually run by computers and are the most commonly used form of detection. Just as there are companies creating software to protect secret messages and embed watermarks into files, there are also companies creating software to detect and remove them. [2]

In steganography, the detection of steganographic messages is called steganalysis. In the approach described here, the algorithm takes the original file and compares it to the file suspected of carrying the secret message, which only works if a known clean copy of the original is available. In digital watermarking, informed detection is used to find invisible watermarks; it follows the same basic approach, obtaining the original work and comparing it with the copy believed to carry the invisible watermark. Another way of detecting invisible watermarks is blind detection, where the detecting algorithm is given only limited information about the original work. [9] Thus, although steganography and watermarking are very effective, there are ways for them to be detected or removed. Every technique needs a detection method, because rightful owners need a way to check and track their work; if those with illegal intentions discover that method, they too can find and remove the embedded data.
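
The comparison against a known clean copy can be sketched as follows, assuming both files are available as equal-length byte sequences. The check for "differences confined to the lowest bit" is one illustrative heuristic for spotting LSB embedding, not a general steganalysis method, and the function names are hypothetical.

```python
def differing_samples(original, suspect):
    """Indices where the suspect file departs from the known clean copy."""
    if len(original) != len(suspect):
        raise ValueError("files differ in length; compare structure first")
    return [i for i, (a, b) in enumerate(zip(original, suspect)) if a != b]


def lsb_only(original, suspect):
    """True if the files differ and every difference is confined to the lowest bit."""
    diffs = differing_samples(original, suspect)
    return bool(diffs) and all(original[i] ^ suspect[i] == 1 for i in diffs)


clean = bytes(range(50, 82))
suspect = bytearray(clean)
suspect[3] ^= 1   # flip two least significant bits, as LSB embedding would
suspect[10] ^= 1
print(differing_samples(clean, bytes(suspect)))  # [3, 10]
print(lsb_only(clean, bytes(suspect)))           # True
```

Blind detection has no `original` argument to lean on, which is why it has to rely on statistical properties of the suspect file alone.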

What are the two major forms of steganography?

What are the types of steganography?

Text steganography - Hiding data within text files. ...

Image steganography - Hiding information using an image as the cover object.

In which format are most digital photographs stored?

For image capture, the JPEG standard (denoted by the *.jpg extension) is the most widely used file format. Developed by the Joint Photographic Experts Group, it can be found in everything from the simplest point-and-shoot model to the most sophisticated SLR.

Is a data hiding technique that uses host files to cover the contents of a secret message?

Steganography works by hiding information in a way that doesn't arouse suspicion. One of the most popular techniques is least significant bit (LSB) steganography, in which the information hider embeds the secret information in the least significant bits of a media file.

Which data hiding technique replaces bits of the host file with other bits of data?

The two major techniques are insertion and substitution. Insertion places data from the secret file into the host file. When you view the host file in its associated program, the inserted data is hidden unless you analyze the data structure. Substitution replaces bits of the host file with other bits of data.
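
The difference between the two is easiest to see in file size. The sketch below is built on assumptions: the host is a stand-in byte string that merely starts and ends with the JPEG start and end markers (it is not a real image), insertion is modeled as appending data after the end marker, which viewers typically ignore, and substitution as overwriting the lowest bit of existing bytes.

```python
END_MARKER = b"\xff\xd9"  # JPEG end-of-image marker, used here on a stand-in host


def insert(host, secret):
    """Insertion: add the secret after the end marker; the host grows."""
    return host + secret


def recover_inserted(stego):
    """Everything after the first end marker (assumes it occurs once in the host)."""
    return stego.split(END_MARKER, 1)[1]


def substitute(host, bits, offset):
    """Substitution: overwrite the lowest bit of existing bytes; size is unchanged."""
    out = bytearray(host)
    for i, bit in enumerate(bits):
        out[offset + i] = (out[offset + i] & 0xFE) | bit
    return bytes(out)


host = b"\xff\xd8" + bytes(range(16, 48)) + END_MARKER  # markers around dummy data
inserted = insert(host, b"secret")
substituted = substitute(host, [1, 0, 1, 1], offset=2)
print(len(host), len(inserted), len(substituted))  # 36 42 36
print(recover_inserted(inserted))                  # b'secret'
```

Insertion leaves the original bytes untouched but changes the file size, which is what analyzing the data structure would reveal; substitution keeps the size and alters the content instead.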