My research in the Multimedia and Security Team (MAST) is mainly along two directions: Signal Processing Fingerprinting and Content Fingerprinting. A Signal Processing Fingerprint refers to the traces left in a multimedia signal as it goes through various signal processing modules. Such traces can be used to expose anomalous patterns introduced by user manipulations, and thus are helpful in assessing the integrity of multimedia data. A Content Fingerprint is a compact signature that captures unique features of a piece of multimedia content, and can be used to detect illegitimate content duplication. In essence, my research aims to capture and understand both intrinsic and extrinsic properties of multimedia signals, and to develop techniques that utilize such properties for multimedia data protection.

Besides these directions, I have also worked on computer vision techniques such as the Scale-Invariant Feature Transform (SIFT), and my previous research at National Taiwan University focused on lossless image compression and semantic content analysis.

Individual projects are highlighted as follows:


1. Analysis of Digital Imaging Technology


As digital images and videos become more popular due to the prevalence of digital cameras, concerns regarding their origin and authenticity have received increasing attention. Our research, supported by the AFOSR and DoD, aims to robustly identify the source camera model and camera unit of digital images and videos, by estimating the imaging algorithms (left figure below) and the imaging noise (right figure below), respectively.



1.1 Camera Model Identification using Color Interpolation Characteristics

 
Color interpolation is a common step in digital photography that has a crucial impact on the quality of the resulting images. As different camera manufacturers compete with customized color interpolation algorithms to enhance visual quality, the make and model of the source camera can be inferred by identifying the underlying color interpolation algorithm. We analyze state-of-the-art color interpolation algorithms, and construct an identification scheme by means of the enhanced directional color interpolation coefficients (left figure).
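The idea of using interpolation coefficients as identifying features can be illustrated with a minimal sketch. It is a 1-D toy under stated assumptions, not the directional 2-D scheme described above: each missing sample is assumed to be a fixed linear combination of its two measured neighbors, and the weights are recovered by least squares. The function names, the sample values, and the two "camera" weight pairs are all hypothetical.

```python
def interpolate_row(measured, w_left, w_right):
    """Simulate a camera filling in one missing sample between measured ones."""
    row = []
    for left, right in zip(measured, measured[1:]):
        row.append(left)
        row.append(w_left * left + w_right * right)
    row.append(measured[-1])
    return row


def estimate_coefficients(row):
    """Least-squares estimate of (w_left, w_right) from an interpolated row.

    Assumes even indices hold measured samples and odd indices hold
    interpolated ones, then solves the 2x2 normal equations directly.
    """
    s_ll = s_lr = s_rr = s_ly = s_ry = 0.0
    for i in range(1, len(row) - 1, 2):
        left, value, right = row[i - 1], row[i], row[i + 1]
        s_ll += left * left
        s_lr += left * right
        s_rr += right * right
        s_ly += left * value
        s_ry += right * value
    det = s_ll * s_rr - s_lr * s_lr
    w_left = (s_ly * s_rr - s_ry * s_lr) / det
    w_right = (s_ry * s_ll - s_ly * s_lr) / det
    return w_left, w_right


# Hypothetical measured sensor samples shared by both simulated cameras.
measured = [10.0, 40.0, 20.0, 90.0, 30.0]

# Two hypothetical cameras that differ only in their interpolation weights.
row_a = interpolate_row(measured, 0.5, 0.5)
row_b = interpolate_row(measured, 0.75, 0.25)

coeffs_a = estimate_coefficients(row_a)
coeffs_b = estimate_coefficients(row_b)
```

In this noise-free toy the estimated weights match the simulated ones, so the two rows are separable by their coefficients alone. With real images the estimates would be noisy and content dependent, which is why a trained classifier over the coefficients is needed.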

Our techniques provide robustness against various factors that may affect the identification. In particular, we analyze the effect of image content, and we show that more training images are required to identify the underlying camera model of man-made scene images than of natural scene images (right figure). Our scheme has shown an accuracy higher than 95% when more than 20 cameras and cell-phone cameras are jointly used for performance evaluation. Further, we develop an algorithm based on convex optimization techniques for selecting training images that match a given testing image. This algorithm is efficient and has shown promising accuracy even for unseen image content.

On the other hand, an attacker may conduct anti-forensic operations to counteract the identification of the underlying color interpolation characteristics. It is therefore of critical importance to understand how robust such identification is in an adversarial environment. Our study shows ways in which identification results can be manipulated while preserving image quality, and motivates countermeasures that a forensic analyst can adopt to resist attacks.

Keywords: digital image processing, machine learning, convex optimization, image quality assessment, game theory.



Details can be found in:
W.-H. Chuang and M. Wu, "Semi Non-Intrusive Training for Cell-Phone Camera Model Linkage", IEEE International Workshop on Information Forensics and Security (WIFS), 2010. [pdf] [slides]

W.-H. Chuang and M. Wu, "Content-Aware Camera Model Identification", to be submitted for journal publication.
[preprint available soon]

W.-H. Chuang and M. Wu, "Robustness of Color Interpolation Identification against Anti-Forensic Operations," to appear, Information Hiding Conference (IH), 2012. [preprint available soon]


1.2 Source Camera Identification using Imaging Noise


In addition to identifying the camera model, another important identity of digital images is the source camera unit that was actually used. It has been shown that the imaging noise, also called the Photo Response Non-Uniformity (PRNU), formed during imaging is a discriminative trace of individual cameras left in digital images. PRNU can be estimated from a digital image by estimating the weak noise component left in the image. We enhance this technique by modifying the noise estimation module according to the reliability of each image pixel. Further, we extend this technique to strongly compressed digital videos and leverage the difference between different parts of the video to improve the identification accuracy and efficiency. For example, we show that I-frames in a test video, as compared to P-frames, have a higher (about twice) correlation with the reference camera PRNU pattern (left figure). Such a difference, when properly exploited, can be used to reorder and weight individual frames to improve the identification performance (right figure).
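The frame-weighting idea can be sketched as follows. This is a toy illustration under stated assumptions: the noise residuals are taken as already extracted (the denoising step is omitted), they are flattened to short 1-D lists, and all values, the 2:1 weighting, and the function names are hypothetical rather than taken from the published method.

```python
import math


def normalized_correlation(a, b):
    """Normalized cross-correlation between two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return num / den if den > 0 else 0.0


def combine_residuals(residuals, weights):
    """Weighted average of per-frame noise residuals."""
    total = sum(weights)
    length = len(residuals[0])
    return [
        sum(w * r[k] for r, w in zip(residuals, weights)) / total
        for k in range(length)
    ]


# Toy data (assumed, not measured): a reference PRNU pattern and two residuals.
reference = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
i_frame = [1.2, -0.9, 0.7, -0.8, 0.9, -1.2, 1.3, -1.2]  # lightly distorted
p_frame = [2.5, 0.0, -0.2, -0.2, 0.1, -2.4, 1.6, -1.4]  # heavily distorted

i_corr = normalized_correlation(i_frame, reference)
p_corr = normalized_correlation(p_frame, reference)

# Compare equal weighting with a hypothetical 2:1 weighting favoring the I-frame.
equal = combine_residuals([i_frame, p_frame], [1.0, 1.0])
weighted = combine_residuals([i_frame, p_frame], [2.0, 1.0])
equal_corr = normalized_correlation(equal, reference)
weighted_corr = normalized_correlation(weighted, reference)
```

In this toy setting the cleaner frame correlates more strongly with the reference, and giving it a larger weight raises the correlation of the combined estimate. How the weights should be chosen for real video is a separate design question.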


Keywords: digital image and video processing, machine learning, signal detection and estimation theory.

Details can be found in:

  • W.-H. Chuang, H. Su, and M. Wu, “Exploring Compression Effects for Improved Source Camera Identification using Strongly Compressed Video”, IEEE International Conference on Image Processing (ICIP), 2011. [pdf] [slides]


1.3 Implementation of Digital Imaging Analysis Software


We provide software modules for camera model and camera unit identification using digital images and videos, with the support of the DoD. We implement both MATLAB prototypes and efficient C++ APIs. In order for the software to run promptly and robustly during field use, we optimize the code architecture with complete error handling, resource management, logging, regression testing and version control.



Keywords: C/C++, MATLAB, error handling, version control (git).



2. Tampering Identification by Empirical Frequency Response


The need for authenticating digital photos becomes increasingly important as digital photo editing technology becomes more popular and easier to use. For a given digital photo, one may ask whether it has been tampered with or manipulated and, if so, by what type of tampering operation. We focus on the latter question and present a framework based on the Empirical Frequency Response (EFR) that aims to identify the manipulation type. We observe that many classes of linear shift-invariant (LSI) or non-LSI image processing operations, such as re-sampling, JPEG compression, and non-linear filtering, exhibit distinctive patterns in their EFRs.

[Figure: typical EFRs]

To identify different tampering operations, we construct classifiers based on representative features extracted from the EFRs. In practical scenarios, the original photos are not necessarily available, and we propose to use blind deconvolution methods to estimate them based only on the photos under test.
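A minimal sketch of what an EFR can look like is given below. It assumes one plausible reading of the term, namely the per-frequency magnitude ratio between the spectra of the processed and the original signal, and it works on a 1-D signal with a plain DFT for brevity. The function names, the test signal, and the 3-tap moving-average "manipulation" are illustrative assumptions, and the original signal is taken as known rather than estimated by blind deconvolution.

```python
import cmath


def dft(signal):
    """Plain O(n^2) discrete Fourier transform of a short 1-D signal."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def empirical_frequency_response(original, processed, eps=1e-12):
    """Magnitude ratio |Y(k)| / |X(k)| for every frequency bin k."""
    spectrum_in = dft(original)
    spectrum_out = dft(processed)
    return [abs(y) / max(abs(x), eps) for x, y in zip(spectrum_in, spectrum_out)]


def circular_moving_average(signal):
    """Hypothetical manipulation: 3-tap smoothing with wrap-around borders."""
    n = len(signal)
    return [
        (signal[(t - 1) % n] + signal[t] + signal[(t + 1) % n]) / 3.0
        for t in range(n)
    ]


# An impulse has a flat spectrum, so no frequency bin of the input is zero.
original = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
processed = circular_moving_average(original)

efr = empirical_frequency_response(original, processed)
```

For this smoothing filter the ratio is 1 at the lowest frequency and drops to 1/3 at the highest, a low-pass shape. Features that summarize such a curve are the kind of input a classifier over manipulation types could use.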

Keywords: digital image processing, blind deconvolution, natural image statistics, machine learning.

Details can be found in:
  • W.-H. Chuang, A. Swaminathan, and M. Wu, “Tampering Identification Using Empirical Frequency Response”, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2009. [pdf] [poster]


3. Impacts of Ordinal Ranking On Content Fingerprinting


Ordinal ranking can be viewed as a quantization module that maps real-valued fingerprint feature vectors into integer values. This module has been experimentally reported to improve robustness against noise and global transformations. From the viewpoint of modular analysis, we quantitatively model and study the impacts of ordinal ranking, taking different fingerprinting parameters such as length, inter-entry correlation, and distortion strength into consideration.


We first study the impacts of ordinal ranking on the achievable identification performance when global distortion is present. We derive closed-form expressions, and our predictions fit both synthetic and real image data very well. On the other hand, strong local variations such as logo insertion into a block may change the ranks of all blocks. This is well known as the sensitivity issue of ordinal ranking, and we provide a theoretical understanding of this sensitivity and how it might be mitigated. Our analytical understanding eventually leads to two improvements of the rank-based representation for higher identification performance.
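Both behaviors described above, robustness to a global transformation and sensitivity to a strong local change, can be seen in a small sketch. The feature values (imagined as block means of an image split into six blocks), the distortions, and the function name are hypothetical and chosen only for illustration.

```python
def ordinal_rank(features):
    """Map a real-valued feature vector to integer ranks (0 = smallest)."""
    order = sorted(range(len(features)), key=lambda i: features[i])
    ranks = [0] * len(features)
    for rank, index in enumerate(order):
        ranks[index] = rank
    return ranks


# Hypothetical block-mean features of an image split into 6 blocks.
original = [0.12, 0.80, 0.45, 0.33, 0.95, 0.60]

# Global distortion: the same increasing map applied to every block.
global_distorted = [1.5 * v + 0.1 for v in original]

# Local distortion: a bright logo inserted into block 0 only.
local_distorted = [0.99] + original[1:]

rank_original = ordinal_rank(original)
rank_global = ordinal_rank(global_distorted)
rank_local = ordinal_rank(local_distorted)

# Number of blocks whose rank changed after the local distortion.
changed = sum(a != b for a, b in zip(rank_original, rank_local))
```

The global change leaves every rank untouched, while altering a single block shifts the rank of all six blocks in this example, which is the sensitivity issue in its most extreme form.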

              


Keywords: digital image and video processing, signal detection and estimation theory, information theory, combinatorics.

Details can be found in:
  • W.-H. Chuang, A.L. Varna, and M. Wu, "Modeling and Analysis of Ordinal Ranking in Content Fingerprinting", submitted for peer review. [pdf]
  • W.-H. Chuang, A.L. Varna, and M. Wu, "Performance Impact of Ordinal Ranking on Content Fingerprinting", IEEE International Conference on Image Processing (ICIP), 2010. [pdf] [poster]
  • A. L. Varna, W.-H. Chuang, and M. Wu, “A Framework for Theoretical Analysis of Content Fingerprinting”, SPIE and IS&T Media Forensics and Security, 2010. [pdf] [slides]
  • W.-H. Chuang, A.L. Varna, and M. Wu, "Modeling and Analysis of Ordinal Ranking in Content Fingerprinting", IEEE International Workshop on Information Forensics and Security (WIFS), 2009. [pdf] [slides]


4. Evaluating the Quality of SIFT Descriptors


The Scale-Invariant Feature Transform (SIFT) is one of the most popular local image features and is widely used in computer vision, image processing and image retrieval. We study the relation between the SIFT descriptor and its matching accuracy. We propose a framework to quantitatively assess the quality of a SIFT feature descriptor in terms of robustness and discriminability. This would enable us to gain a better understanding of the strengths and limitations of SIFT in emerging applications such as SIFT-based image hashing, and also to improve matching accuracy and efficiency in applications such as object recognition. The proposed technique improves the matching by pruning the noisy SIFT feature selection result (the top figure) according to descriptor quality (the bottom figure).