CS699 - Software Forensics

Spring 2020

Location: GFS 213
Time: Monday and Wednesday, 10–11:40am
Class Number: 29980D

Instructor: Nenad Medvidovic

Electronic Mail: neno@usc.edu
Office: SAL 338
Office Phone: (213) 740-5579
Office Hours: Monday 12–1pm or by appointment

Overview

Software forensics is frequently defined as the science of analyzing software source code or binary code to determine whether intellectual property infringement or theft occurred. As such, software forensics is the centerpiece of lawsuits, trials, and settlements when companies and/or individuals are in dispute over issues involving software patents, copyrights, and trade secrets. Furthermore, software forensics tools are usually touted as automated aids for comparing code to determine differences and correlations between software systems. In turn, such measures can be used to guide a software forensics expert in rendering an opinion regarding infringement, theft, copyright violation, or trade secret breach.

This usual characterization does not tell the entire story, however. In practice, forensics goes well beyond code comparison and spans a wide range of key software engineering activities. Forensics experts are very experienced software engineers. Real-world forensics problems, such as patent infringement cases (e.g., the widely reported series of lawsuits between Apple and Samsung), almost never involve simple and naïve code copying. Instead, an expert must consider a huge amount of available information—source code, but also technical documentation, software design artifacts, websites, datasheets, marketing materials, sworn testimony from system developers, etc.—to understand a system, family of systems, or product line in question. Forensics experts must use well-worn software engineering techniques and tools such as code inspections, design reviews, static analysis, and software testing to understand and organize this information. Specifically, the information must be organized and cast in a way that allows its comparison to another source of information—either another system (or set of systems) or a patent (or family of patents). In the case of system-to-system comparisons, forensic experts must establish required degrees of similarity between key aspects of the respective systems’ architectures and implementations. In the case of system-to-patent comparisons, forensic experts must essentially treat a patent as an amalgam of software requirements and design information, and establish whether the system in question implements them faithfully.

Those familiar with software and its construction will know that the above are complex, and critical, software engineering tasks. This class will, therefore, study the range of software forensics issues from an explicit software engineering perspective. The objective of the class is two-fold:
  1. The class will expose the students to the state-of-the-art practices applied in this domain.
  2. The class will highlight areas where software forensics can benefit from an explicit software engineering perspective, as well as where software engineering as a discipline can be improved by adopting the best forensics practices.
The students will be assigned a series of readings that will balance the theoretical foundations of software forensics and their practical applications and implications.

Academic Integrity

Students must work independently on all individual assignments; collaborating on individual assignments is considered cheating and will be penalized accordingly. All USC students are responsible for reading and following the USC Student Conduct Code, which prohibits plagiarism. Examples of behavior that is not allowed include: copying all or part of someone else's work (by hand or by looking at others' files, either secretly or if shown) and submitting it as your own; giving another student in the class a copy of your assignment solution; consulting with another student during an exam; and copying text from published literature without proper attribution. If you have questions about what is allowed, please discuss them with the instructor.

Students who violate University standards of academic integrity are subject to disciplinary sanctions, including failure in the course and suspension from the University. Since dishonesty in any form harms the individual, other students, and the University, policies on academic integrity have been and will be strictly enforced.

Readings

Recommended Textbook

Bob Zeidman. The Software IP Detective’s Handbook: Measurement, Comparison, and Infringement Detection, Prentice Hall, 2011.
  • ISBN-10: 0137035799
  • ISBN-13: 9780137035793

Supplemental Readings

Additional readings will be assigned throughout the semester.

Lecture slides will be accessible on-line prior to each lecture, by going to the appropriate Lecture Topic in the Schedule.

Grade Breakdown

  • Participation (10%): Students are expected to complete the readings before each lecture and to actively participate in discussions.
  • Presentation (20%): Each student will present one topic pertinent to the course, during the second half of the semester. The topic selection and presentation preparation will be done in coordination with the instructor.
  • Exam (30%): The written exam will assess the students' understanding of course material and ability to use information covered in class to think critically about different facets of software forensics.
  • Course Project (40%): The project will involve a practical aspect of software forensics (e.g., empirical assessment of commonly used tools, survey of an important but under-studied sub-area of forensics, etc.). Project details will be announced in Week 7.

Schedule

This schedule is subject to change; check it regularly.

Week

Date

Lecture Topic

Readings

Assignments and Exams

1

1/13

  • Chapter 2 – Intellectual Property Crime

1/15

  • Case study 1 – Prominent IP disputes
  • Online Resources
  • Research and come prepared to discuss one software IP dispute case

2

1/20

  • Martin Luther King’s Birthday (no class)

1/22

  • Chapter 6 – Copyrights
  • Chapter 7 – Patents
  • Chapter 8 – Trade Secrets

3

1/27



1/29


4

2/3


2/5

5

2/10

  • Chapter 3 – Source Code
  • Chapter 4 – Object Code and Assembly Code
  • Chapter 5 – Scripts, Intermediate Code, Macros, and Synthesis Primitives
 

2/12

  • Chapter 12 – Software Differentiation Applications
  • Chapter 13 – Software Plagiarism Detection

6

2/17

  • Presidents’ Day (no class)

2/19

  • Software V&V Addenda:
  • Chapter 10 – Software Differentiation Theory
  • Chapter 11 – Software Differentiation Implementation

7

2/24

  • Chapter 22 – Detecting Copyright Infringement
  • Chapter 23 – Detecting Patent Infringement
  • Chapter 24 – Detecting Trade Secret Theft
2/26
  • Chapter 15 – Theory
  • Chapter 16 – Implementation
  • Chapter 17 – Applications

8

3/2

  • Case study 2 – Influential software patents
  • Understanding patents as software engineering artifacts

3/4

  • Chapters 6, 7, 8

9
3/9
  • PRESENTATION / PROJECT TIME

3/11
  • PRESENTATION / PROJECT TIME
  • Presentations selected (by 3/6 at 11:59:59pm)

10

SPRING RECESS

11

3/23

  • Chapter 22

3/25

  • Chapters 23, 24

12

3/30


4/1


13

4/6

4/8


14

4/13


4/15

  • EXAM

15

4/20

  • Exam Recap
  • Project discussion


4/22


16
4/27
  • GUI widget recognition and correlation
    • Nikola Lukić and Saghar Talebipour
  • Source code similarity detector for UCC
    • Elaine Venson
  • Flexible architecture recovery and visualization
    • Marcelo Laser



4/29
  • Similarity detection on obfuscated ELF binary files
    • Sima Arasteh
  • Analysis of code-clone detectors on BigCloneEval
    • Michael Shoga
  • Analysis of Web accessibility failures
    • Paul Chiou

  • Final projects due (May 6, 11:59:59pm)