
Investigating SSH Information Leaks

Primary Investigator

Thomas Flucke

Department of Computer Science and Software Engineering, College of Engineering

Faculty Advisor

Bruce DeBruhl

Department of Computer Science and Software Engineering, College of Engineering

Statement of Purpose, benefits, and hypotheses

In this experiment, I plan to explore possible data leaks in the common secure communications program SecureShell (SSH). This program is commonly used in industry and within Cal Poly to create secure connections between a user and a remote computer. I hypothesize that the current implementation is flawed in such a way that publicly visible patterns in encrypted internet traffic reveal information such as who is using the program and the content of the encrypted data. I will test this hypothesis with an experiment in which participants type into the SecureShell program. The participants will be volunteers from among students in the computer science department. The risk to volunteers is minimal and the potential discovery is significant. The data will be collected in an environment isolated from the volunteers' personal computers. Data will be anonymized so that it contains no personally identifiable information.

Methods

Subjects and Subject Characteristics

Volunteers will be gathered from systems classes in the Cal Poly C.S. department. These classes will be upper-division classes related to systems and security programming. All of the classes will have Systems Programming (CPE 357) as a prerequisite or be the Systems Programming class.

Investigators

Thomas Flucke, Student, California Polytechnic State University, College of Engineering, Department of Computer Science and Software Engineering

Materials and Procedures

Materials

A "router" which records encrypted traffic and forwards it to its destination. The router will be configured to accept only connections involved in this study and will store only encrypted information.

A virtual machine with a key logger installed and configured to send traffic through the "router". The virtual machine isolates monitors used in this study from materials unrelated to this work.

Procedure

Dataset 1

Before the Experiment

Volunteers will be provided consent form 1 and an explanation of the experiment and how the information they provide will be used.

During the Experiment

Volunteers will respond to a series of prompts with complete English sentences. Prompts will not ask for any personally identifiable information. This will provide a baseline of network traffic patterns for normal English typing. Then the experimenter will ask the volunteer to copy an English paragraph into the terminal. This will provide a more structured dataset which eliminates variations in content between respondents. Because copying is a stricter instruction than free response, it may skew typing patterns and therefore the traffic they generate, so both tasks are necessary. The volunteer will enter their responses into a monitored terminal with a SecureShell connection to the computer science department's servers over the internet. The router will observe the public network traffic of the connection and record the data.
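As a rough illustration of what a "network traffic pattern" could look like in the recorded data, the sketch below computes the time gaps between consecutive packets. The PacketRecord type, its field names, and the sample values are assumptions made for this example; the protocol does not specify how the router stores its observations.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PacketRecord:
    """One observed packet (hypothetical record layout, not from the protocol)."""
    timestamp: float  # seconds since the start of the capture
    size: int         # length in bytes; the payload itself stays encrypted


def inter_arrival_times(packets: List[PacketRecord]) -> List[float]:
    """Return the gaps, in seconds, between consecutive packets."""
    ordered = sorted(packets, key=lambda p: p.timestamp)
    return [later.timestamp - earlier.timestamp
            for earlier, later in zip(ordered, ordered[1:])]


# Made-up sample values, purely for illustration.
sample = [PacketRecord(0.00, 36), PacketRecord(0.25, 36), PacketRecord(0.75, 36)]
gaps = inter_arrival_times(sample)
```

Only timing and size are used here, which matches the statement that the router stores only encrypted information.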

Then the volunteer will be asked to perform a few basic tasks over the SecureShell connection. This will provide data for the traffic generated by normal usage of SecureShell. Afterwards, volunteers will execute a series of basic Unix commands over the SecureShell connection. This provides a structured baseline for comparison, similar to the English paragraph but in the context of Unix commands rather than English. Again, both the volunteers' inputs and the network traffic will be monitored and recorded.
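Because both the typed input and the network traffic are recorded, the two logs can later be lined up by time. The following is a minimal sketch of one way to do that, assuming the key logger and the router share a synchronized clock and that each keystroke is followed shortly by a packet. The function name, the 50 ms tolerance, and the sample values are hypothetical and are not part of the protocol.

```python
from bisect import bisect_left
from typing import List, Optional, Tuple


def align_keystrokes(
    keystrokes: List[Tuple[float, str]],
    packet_times: List[float],
    max_delay: float = 0.05,  # assumed tolerance in seconds
) -> List[Tuple[str, Optional[float]]]:
    """Pair each keystroke with the first packet seen at or after it.

    A keystroke with no packet within max_delay is paired with None.
    """
    times = sorted(packet_times)
    pairs = []
    for pressed_at, key in keystrokes:
        i = bisect_left(times, pressed_at)
        if i < len(times) and times[i] - pressed_at <= max_delay:
            pairs.append((key, times[i]))
        else:
            pairs.append((key, None))
    return pairs


# Made-up sample values, purely for illustration.
keys = [(1.00, "l"), (1.20, "s"), (2.00, "\n")]
packets = [1.01, 1.22, 3.0]
aligned = align_keystrokes(keys, packets)
```

This simple version may pair two keystrokes with the same packet if they fall very close together. A real analysis would need to decide how to handle that case.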

In total, responding to the prompts should not take more than 30 minutes. The test can be paused if the volunteer requests a break.

An investigator will be present for the entire test to give directions and clarifications.

Volunteers will be permitted to make adjustments to the chair, table, or other surroundings in order to make themselves more comfortable.

The volunteer's data will be marked with a one-way cryptographic hash of their Cal Poly username. This allows the data to be uniquely identified without directly connecting the data to a particular volunteer or exposing the volunteer's participation in this study.
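The labeling step could be sketched as follows. The choice of SHA-256, the normalization of the username, and the optional secret key are assumptions for this example; the protocol only states that a one-way hash of the username is used.

```python
import hashlib
import hmac
from typing import Optional


def pseudonym(username: str, key: Optional[bytes] = None) -> str:
    """Return a hex label derived from a username.

    Without a key this is a plain SHA-256 digest. With a key it is
    HMAC-SHA256, a keyed variant that is NOT described in the protocol.
    """
    # Normalizing case and whitespace is an assumption, so that the same
    # volunteer always maps to the same label.
    data = username.strip().lower().encode("utf-8")
    if key is None:
        return hashlib.sha256(data).hexdigest()
    return hmac.new(key, data, hashlib.sha256).hexdigest()


label = pseudonym("jdoe")  # "jdoe" is a made-up username
```

Usernames come from a small, guessable set, so an unkeyed digest can be matched by hashing candidate usernames. The keyed variant is included as one possible way to make that harder, provided the key is kept separate from the dataset.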

Dataset 2

Before the Experiment

Volunteers will be provided consent form 2 and an explanation of the experiment and how the information they provide will be used.

Volunteers will be provided the virtual machine described above and informed that any information entered into it will be recorded. Volunteers will be discouraged from using the virtual machine for purposes unrelated to the study. Volunteers will be informed that their usage of SecureShell is not the subject of study.

If a volunteer does use the virtual machine for any purpose not related to the study, the extraneous data generated will be filtered from the dataset. If the data cannot be filtered, the volunteer's contributions will be discarded.

Volunteers will also be informed that participating in the study, or declining to participate, will not affect their grade and will not hinder their ability to complete their classwork.

During the Experiment

Volunteers will perform their normal classwork using the virtual machine as the terminal to the school server. Once on the school server, the volunteer will complete their classwork as they otherwise would. This will provide an organic dataset which represents real-world use cases of SecureShell.

The additional time to complete the tasks while participating in this study is negligible.

An investigator will be present for the entire test to give directions and clarifications.

The volunteer's data will be marked with a one-way cryptographic hash of their Cal Poly username. This allows the data to be uniquely identified without directly connecting the data to a particular volunteer or exposing the volunteer's participation in this study.

Study Location

The participants in the study will perform their work on the Cal Poly campus. Dataset 1 will be gathered with the participants physically in the Cal Poly Computer Science building (Building 14). Dataset 2 will be collected in the volunteer's usual classroom setting. Additionally, data will be recorded on a router hosted on Amazon Web Services (AWS) on the internet.

Informed Consent Form

All participants will be asked for informed consent before participating. Participants who contribute to both datasets will be asked to sign both informed consent forms.

Debriefing Statement for Projects Involving Deception and Incomplete Disclosure

No deception or incomplete disclosure will be involved in this project.