
Thesis Journal

Quarter 1 (Spring 19)

2019-04-15

  • Background research
    • "Timing Analysis of Keystrokes and Timing Attacks on SSH" - Song et al.
    • Basically the exact same attack applied to passwords. Very useful.
    • Found several papers using acoustics to determine keystrokes
  • Found useful Wireshark filter for SSH keystrokes (tcp.dstport == 22 and tcp.flags.push == 1)
  • Wrote proposal
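
The Wireshark filter noted above (destination port 22, PSH flag set) can also be checked against a raw TCP header outside Wireshark. A minimal sketch, assuming the caller already has the TCP header bytes; the function name is hypothetical:

```python
import struct

PSH_FLAG = 0x08  # PSH bit in the TCP flags byte


def is_ssh_keystroke_candidate(tcp_header: bytes, ssh_port: int = 22) -> bool:
    """Return True if the segment targets the SSH port and has PSH set."""
    if len(tcp_header) < 20:  # shorter than the minimum TCP header
        return False
    # Bytes 0-1: source port, bytes 2-3: destination port (network byte order).
    _src_port, dst_port = struct.unpack("!HH", tcp_header[:4])
    flags = tcp_header[13]  # flags byte of the TCP header
    return dst_port == ssh_port and bool(flags & PSH_FLAG)
```

This only mirrors the filter; it says nothing about whether a given PSH packet really carries a keystroke.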

2019-04-16

  • Submitted thesis proposal to DeBruhl

  • Discussed logistics of testing

  • Began arrangements with Mammen on using 357 students.

    • Still need to submit paperwork
    • Need to talk to Nico about O.S.

2019-04-18

  • Researched NLP and hidden Markov models

2019-04-19

  • Requested VirtualBox be installed on CSL machines

  • Acquired V.M. for router

    • Decided to use a V.M. on AWS to get realistic network variance/latency
    • Plan: Set up as router
    • Concern: If I make it a proxy router, it may be abused. See if I can transparently route or limit to only my subjects' VMs.
      • Concern: If I transparently route, BGP may break my MitM. Might not be an issue if I only care about the upstream.
    • Plan: Open necessary ports
  • Researched methods of tracking/tagging SSH connections

    • Surprisingly difficult to track origin machine - Mostly due to University Wi-Fi NAT
    • Nothing in the packet meta-data uniquely and consistently identifies a victim
    • SSH man page describes session variables which may be useful
    • SSH_CONNECTION contains the victim port from the remote host perspective
      • I can use the timestamp + src port to link a connection to an ID
      • Tried printing SSH_CONNECTION to local file on connection init (failed)
      • Tried printing SSH_CONNECTION to remote file on init (I don't have access to remote file)
      • Started working with ncat to send ID + time + port to router
      • I can then join on time + port to assign each SSH flow to an ID
      • ncat unencrypted by default, privacy concern + potential abuse
      • ncat has been uninstalled from the Unix servers, so not viable
      • Plan: write a basic web server, TLS encrypt, limit to only accept Unix server connections, use curl to send ID
      • Concern: Unix server IPs may not be static. Unlikely but possible.
  • Found several simple keyloggers

    • Plan: Find one which can monitor one process at a time. Must also include timestamps
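
The linking idea from the SSH_CONNECTION notes above (join on timestamp + source port to assign each SSH flow to an ID) could look roughly like this. A sketch only: the tuple layouts, function names and the 5-second tolerance are assumptions. The SSH_CONNECTION layout (client IP, client port, server IP, server port, space-separated) is from the ssh man page.

```python
def parse_ssh_connection(value):
    """Split SSH_CONNECTION: 'client_ip client_port server_ip server_port'."""
    client_ip, client_port, server_ip, server_port = value.split()
    return {
        "client_ip": client_ip,
        "client_port": int(client_port),
        "server_ip": server_ip,
        "server_port": int(server_port),
    }


def assign_flow_ids(reports, flows, tolerance_s=5.0):
    """Attach a subject ID to each flow by matching source port and start time.

    reports: list of (subject_id, unix_time, client_port)  # hypothetical layout
    flows:   list of (flow_start_unix_time, src_port)      # hypothetical layout
    Returns a list of ((flow_start, src_port), subject_id or None).
    """
    result = []
    for start, port in flows:
        best_id = None
        best_gap = tolerance_s
        for subject_id, reported_at, reported_port in reports:
            gap = abs(reported_at - start)
            # Same port and closest report time within the tolerance wins.
            if reported_port == port and gap <= best_gap:
                best_id, best_gap = subject_id, gap
        result.append(((start, port), best_id))
    return result
```

Ports get reused over time, which is why the time tolerance matters as much as the port match.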

2019-04-20

  • Realized that using an internet-hosted V.M. as the router doesn't work because I can't force it to be the default gateway

    • Try making router impersonate Unix VMs by forwarding the connection and adding a routing rule to the victim VM
    • How do I tell the difference between a legitimate SSH connection and one to be forwarded?

      • Use a different port for real SSH connections?
      • Might be easier to just use VPNs and hook on new connections

2019-04-21

  • Gathered preliminary data on Unix commands

    • Used my Ubuntu VM without Emacs (which I installed afterwards)
    • 1828 commands installed by default, most of them length 8 (normal-ish distribution around it)
    • A lot of these seem like administrative commands
    • 315 non-privileged commands, most of them at length 7
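
The notes don't say how the counts above were produced. One way to get a command-name length distribution, as a sketch (the function name and the choice of directories are assumptions):

```python
import os
from collections import Counter


def command_length_histogram(directories):
    """Count executable file names by name length across the given directories."""
    names = set()
    for directory in directories:
        if not os.path.isdir(directory):
            continue  # skip PATH entries that do not exist
        for entry in os.listdir(directory):
            path = os.path.join(directory, entry)
            if os.path.isfile(path) and os.access(path, os.X_OK):
                names.add(entry)  # de-duplicate names found in several dirs
    return Counter(len(name) for name in names)
```

Calling it with `os.environ["PATH"].split(os.pathsep)` covers whatever the current user can run; restricting to directories like /bin and /usr/bin would approximate the non-privileged set.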

2019-04-22

  • Began IRB paperwork

  • Began writing prompts for students

2019-04-23

  • Plan: Talk to Lupo/Pentoja about Fall Quarter

  • Spoke with Debbie Hart

    • Mention that I have no influence over grading status
    • Mention whether or not there is difference in ability to complete homework
    • Ensure students don't feel pressured to participate
    • Possibly two consent forms
    • Responses are not used, only information generated by the system
    • If any part of analysis uses plain responses, submit an IRB
  • Submitted IRB paperwork

2019-04-24

  • Wrote keylogger script for SSH

    • Concern: No way to not trace the password entered into SSH. Might have to instruct students on RSA keys.
  • Wrote up Context-Aware SSH docs

2019-04-27

  • Extensively tested key logger

2019-04-29

  • IRB approved

    • Submitted revised consent form
  • Talked to Tedd about getting VirtualBox on lab machines

  • Tested OpenVPN

    • Measured added latency (8.8.8.8 reference point): ~12.5ms
  • Wrote tcpdump filter for only traffic SSHing into another system

    • Considered: Fixing to only unix server
    • I don't know if the IPs will change and it's still potentially valid data
    • Considered: Filtering for only PSH packets (all keystrokes are PSH)
    • I might need the extra information later so I'll hold on to it.
  • Tested pairing network packets to keystrokes

    • Difference between key and packet observation: ~11.8ms
    • First SYN packet is within 100ms of SSH starting in log

2019-04-30

  • Discussed thesis progress with DeBruhl

    • Password entry is questionable
    • Maybe write script that generates an SSH key if asked for password?
    • Might have to just filter it out
    • Decided port mapping might be unnecessary
  • Ethan (357 student) expressed interest in project

    • DeBruhl offered to do 400 in Fall
  • Set up cron to automatically run packet tap on reboot + AppArmor permissions

  • Server suddenly cannot connect to Unix1

    • Forgot to save iptables rule

2019-05-01

  • Started writing script to automatically pair keylog files to packet flows

    • Got to the point where it could match the start of a file to a TCP SYN

2019-05-02

  • Finished and tested script

    • Was able to match all packets within a 50 ms time difference
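
A rough sketch of the pairing step described above, not the actual script: each keystroke timestamp is paired with the nearest unused packet timestamp within a tolerance. The 50 ms default comes from the note above; the function name and input layout are assumptions.

```python
from bisect import bisect_left


def match_keystrokes(key_times, packet_times, tolerance_s=0.050):
    """Pair each keystroke with a nearby unused packet within tolerance.

    Both inputs are sorted lists of timestamps in seconds.
    Returns a list of (key_time, packet_time or None).
    """
    used = set()
    pairs = []
    for key_time in key_times:
        i = bisect_left(packet_times, key_time)
        best = None
        # Check the packets on either side of the insertion point.
        for j in (i - 1, i):
            if 0 <= j < len(packet_times) and j not in used:
                gap = abs(packet_times[j] - key_time)
                if gap <= tolerance_s and (
                    best is None or gap < abs(packet_times[best] - key_time)
                ):
                    best = j
        if best is not None:
            used.add(best)
        pairs.append((key_time, packet_times[best] if best is not None else None))
    return pairs
```

This assumes both clocks are already roughly aligned; a constant offset between them would need to be removed first.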

2019-05-03

  • Tested multiple keylog files.

  • Discovered that different keylogs have different delays.

    • Delay within one log file fairly consistent
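
Since the delay is fairly consistent within one log but differs between logs, one option is to estimate a single offset per keylog and shift that log's timestamps by it. A sketch under that assumption; the median is my choice here (robust to a few bad matches), not something the notes specify, and the names are hypothetical:

```python
from statistics import median


def estimate_log_delay(pairs):
    """Median packet-minus-keystroke delay for one keylog, in seconds.

    pairs: list of (key_time, packet_time or None) for a single keylog.
    """
    deltas = [packet - key for key, packet in pairs if packet is not None]
    if not deltas:
        raise ValueError("no matched pairs to estimate a delay from")
    return median(deltas)


def align_keystrokes(key_times, delay_s):
    """Shift keystroke timestamps by the log's estimated delay."""
    return [t + delay_s for t in key_times]
```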

2019-05-06

  • Created scripts which copy keylog file to router VM.

2019-05-07

  • Changes in delay attributable to AWS server instances changing

  • Tested two SSH sessions at same time

    • Found that it uses fd 5 instead of 4 - Not sure why
  • Found issue with VM routing

    • Apparently, upon reboot it adds tun0 without a VPN being turned on, then tries to route all traffic through the inactive device
    • --No idea why.--
    • Found old configuration. systemctl task was trying to open VPN separately.
  • Set up VPN to automatically enable when VM boots

  • Plan: Separate out each flow into a separate pcap

2019-05-08

  • Worked on separating each TCP flow into a separate pcap

    • Apparently editcap can't do this on its own so I'm writing my own utility for this
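
The core of a flow splitter is grouping packets by their endpoints regardless of direction. A minimal sketch of just that grouping step, assuming packets have already been parsed into dicts (the dict layout and function names are hypothetical; reading and writing the pcap files themselves is left out):

```python
from collections import defaultdict


def flow_key(src_ip, src_port, dst_ip, dst_port):
    """Direction-independent key so both halves of a TCP flow share a bucket."""
    a, b = (src_ip, src_port), (dst_ip, dst_port)
    return (a, b) if a <= b else (b, a)


def split_flows(packets):
    """Group packets by flow.

    packets: iterable of dicts with src_ip, src_port, dst_ip, dst_port keys.
    Returns a dict mapping flow key -> list of packets in arrival order.
    """
    flows = defaultdict(list)
    for pkt in packets:
        key = flow_key(pkt["src_ip"], pkt["src_port"], pkt["dst_ip"], pkt["dst_port"])
        flows[key].append(pkt)
    return dict(flows)
```

Grouping on endpoints alone merges two connections that reuse the same source port; splitting again on each new SYN would handle that.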

2019-05-09

  • Deployed VM to Unix machines upstairs

    • Apparently most people don't have enough space to house VMs
    • Might just have to make VMs smaller
    • Tested stripping down unneeded packages, didn't do much.
      • Might just have to use headless-ubuntu
  • Tested running Dataset 1 with Griffian

    • Issues with DNS resolution on VPN
    • Packets did not seem to capture --(may have just not flushed yet)--
    • Packets confirmed not captured.
    • Total time ~20 (gave longer than expected answers)
  • Set up Lubuntu VM (~1/2 the size)

2019-05-10

  • Fixed issues with VirtualBox version differences (CSL was running 6, I was running 5)

  • Gathered data from Lucy

  • Wrote script to automatically filter the packets before sending them to server.

2019-05-13

  • Got VMs working on CSL computers - Had to install to /tmp for space reasons

  • Tested client configuration with 3 people simultaneously

    • Each of us, one after another, had the connection break
    • Each break came after ~15 minutes, and each connection was re-established before the next disconnect occurred
  • Added prompt to script to give students a chance to review/approve data before submitting

2019-05-14

  • Discussed progress with DeBruhl

  • Worked on the disconnect problem

    • --Seems like it happens after every 15 minutes almost exactly--
    • Can't seem to make it happen at all now.
  • Gathered data from a few students:

    • One had a disconnect but no noticeable issues
    • Another had a complete disconnect at the very end
    • Testing was completely finished and I was able to manually upload the data
    • Captured log of what happened

      • OpenVPN seems to be detecting itself as a replay attack after network goes down
      • Solution: Set up NTP server (virtual machine system clock is way off)
      • Solution: Use TCP

2019-05-15

  • Talked with Nico about testing O.S./S.S. sections

  • Found bug that caused network to disconnect

    • Network goes down -> link device removed -> route to VPN through device removed -> VPN traffic has no route
    • It takes 2 minutes for OpenVPN to detect the network is broken
      • Once OpenVPN tries to renegotiate, fixing the connection causes errors. Have to fully restart OpenVPN
    • Can fix within first two minutes by adding the route back

2019-05-16

  • Gathered more data from 357 students

    • Data velocity very slow
    • Option: Bigger test - Might dissuade more people
    • Option: Skip to dataset 2
    • Option: More classes
    • Option: Reduce dataset requirements

2019-05-17

  • Asked for volunteers from O.S.

  • Met with 357 students from Griffian's section/Elie

2019-05-18

  • Checked that pcaps, log files, and consent forms line up

    • Because of timezone offsets, some logs fall on a day boundary
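
A quick way to flag logs whose date differs depending on the clock used. This is a sketch that assumes one side records local time and the other UTC, and that the local offset is a fixed UTC-7; both are assumptions, as are the function names.

```python
from datetime import datetime, timedelta, timezone


def local_and_utc_dates(unix_time, utc_offset_hours=-7):
    """Return (local_date, utc_date) for a timestamp and a fixed UTC offset."""
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    local_date = datetime.fromtimestamp(unix_time, tz=local_tz).date()
    utc_date = datetime.fromtimestamp(unix_time, tz=timezone.utc).date()
    return local_date, utc_date


def straddles_day_boundary(unix_time, utc_offset_hours=-7):
    """True if the timestamp falls on different calendar days locally and in UTC."""
    local_date, utc_date = local_and_utc_dates(unix_time, utc_offset_hours)
    return local_date != utc_date
```

Files flagged this way need to be compared against pcaps from both adjacent days.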

2019-07-20

  • Finished refactoring packet matcher to work with flow-separated pcaps.

  • Began looking over data

    • Seems like a lot of files have incomplete data - may have to rerecord all of it.

2019-07-27

  • Looked into issues with files

    • At least one file seems like it has all the correct data and timestamps, but poor matches
    • Seems packet-matcher has issues
    • Fixed. Seems like it was a problem with my keylog pattern matching. FD is not reliable.
    • Some files still fail
    • Time stamps are way off. Some data has timestamps way after my last recorded flow

      • Possibilities:
      • Flow separator did not split off a file for these flows (best case)
      • Unix lab system clocks are wrong (unlikely but recoverable case)
      • Recording system did not capture the data (worst case)
      • Investigation:
      • See if I have a syn-packet for one of my problem flows

        • No Syn packet found. Nothing close. Looking at server to see if I have data at the source.
        • No packets recorded after 2019-05-21.
        • I have several flows from 2019-05-23 that were missed.
        • Other flows are from way earlier. May be a different problem.
        • Some packet captures empty. Seems data capture system needs serious work.
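
The SYN check in the investigation above can be run over every keylog at once instead of one problem flow at a time. A sketch, assuming start times have already been extracted from the logs and SYN timestamps from the captures; the 100 ms window is taken from the earlier observation that the first SYN lands within 100 ms of SSH starting in the log, and the names are hypothetical.

```python
def logs_without_syn(log_starts, syn_times, window_s=0.100):
    """Return names of keylogs with no SYN within window_s of their start.

    log_starts: dict of keylog name -> start timestamp (seconds)
    syn_times:  list of SYN packet timestamps (seconds)
    """
    missing = []
    for name, start in sorted(log_starts.items()):
        if not any(abs(syn - start) <= window_s for syn in syn_times):
            missing.append(name)
    return missing
```

Any log this reports either has a clock offset larger than the window or has no captured flow at all.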

2019-08-11

  • Emailed Ethan

  • Emailed DeBruhl

  • Bought Amazon gift cards

  • Looked for patterns in missing data

    • None found
    • May be a failure in data collection or in data parsing
    • Nothing on the server from 2019-05-23, but there are 3 keylogs from that day.
    • Maybe it has to do with the connection dropping randomly?