Thesis Journal

Quarter 1 (Spring 19)

2019-04-15

  • Background research
    • "Timing analysis of keystrokes and timing attacks on ssh" - Song et al.
    • Basically the exact same attack applied to passwords. Very useful.
    • Found several papers using acoustics to determine keystrokes
  • Found useful wireshark filter for ssh keystrokes (tcp.dstport == 22 and tcp.flags.push == 1)
  • Wrote proposal
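
The Wireshark filter above can be mirrored in a post-processing script. A minimal sketch, assuming each packet has already been reduced to its destination port and raw TCP flags byte; the function name is hypothetical and not part of any capture tool.

```python
PSH = 0x08  # TCP PSH flag bit in the flags byte


def is_ssh_keystroke_candidate(dst_port: int, tcp_flags: int) -> bool:
    """Mirror of: tcp.dstport == 22 and tcp.flags.push == 1."""
    return dst_port == 22 and bool(tcp_flags & PSH)
```

A packet with flags `0x18` (PSH+ACK) headed to port 22 passes; a bare ACK (`0x10`) does not.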

2019-04-16

  • Submitted thesis proposal to DeBruhl
  • Discussed logistics of testing
  • Began arrangements with Mammen on using 357 students.
    • Still need to submit paperwork
    • Need to talk to Nico about O.S.

2019-04-18

  • Researched NLP and Hidden Markov Models

2019-04-19

  • Requested VirtualBox be installed on CSL machines
  • Acquired V.M. for router
    • Decided to use a V.M. on AWS to get realistic network variance/latency
    • Plan: Set up as router
    • Concern: If I make it a proxy router, may be abused. See if I can transparently route or limit to only my subject's VMs.
      • Concern: If I transparently route, BGP may break my MitM. Might not be an issue if I only care about the upstream.
    • Plan: Open necessary ports
  • Researched methods of tracking/tagging SSH connections
    • Surprisingly difficult to track origin machine - Mostly due to University Wi-Fi NAT
    • Nothing in the packet meta-data uniquely and consistently identifies a victim
    • SSH man page describes session variables which may be useful
    • SSH_CONNECTION contains the victim port from the remote host perspective
      • I can use the timestamp + src port to link a connection to an ID
      • Tried printing SSH_CONNECTION to local file on connection init (failed)
      • Tried printing SSH_CONNECTION to remote file on init (I don't have access to remote file)
      • Started working with ncat to send ID + time + port to router
      • I can then join on time + port to assign each SSH flow to an ID
      • ncat unencrypted by default, privacy concern + potential abuse
      • Unix servers uninstalled ncat, not viable
      • Plan: write a basic web server, TLS encrypt, limit to only accept Unix server connections, use CURL to send ID
      • Concern: Unix server IPs may not be static. Unlikely but possible.
  • Found several simple keyloggers
    • Plan: Find one which can monitor one process at a time. Must also include timestamps
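
The time + port join described above could look roughly like the sketch below. It assumes the router receives one report per session (ID, start time, client port taken from `SSH_CONNECTION`, which holds client IP, client port, server IP, and server port separated by spaces). The `Report` type, the function names, and the 5-second tolerance are all hypothetical.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class Report(NamedTuple):
    subject_id: str
    timestamp: float  # epoch seconds when the session reported in
    client_port: int  # second field of SSH_CONNECTION


def parse_ssh_connection(value: str) -> Tuple[str, int, str, int]:
    """Split SSH_CONNECTION into (client_ip, client_port, server_ip, server_port)."""
    client_ip, client_port, server_ip, server_port = value.split()
    return client_ip, int(client_port), server_ip, int(server_port)


def assign_id(flow_start: float, flow_src_port: int,
              reports: Sequence[Report], tolerance: float = 5.0) -> Optional[str]:
    """Return the ID whose report matches the flow's port and is closest in time."""
    candidates = [r for r in reports
                  if r.client_port == flow_src_port
                  and abs(r.timestamp - flow_start) <= tolerance]
    if not candidates:
        return None
    return min(candidates, key=lambda r: abs(r.timestamp - flow_start)).subject_id
```

Picking the closest report in time is only a guess at how to break ties when a port gets reused.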

2019-04-20

  • Realized that using an internet-hosted V.M. as the router doesn't work because I can't force it to be the default gateway
    • Try making the router impersonate the Unix VMs by forwarding connections, and add a routing rule to the victim VM
    • How do I tell the difference between a legitimate SSH connection and one to be forwarded?
      • Use a different port for real SSH connections?
      • Might be easier to just use VPNs and hook on new connections

2019-04-21

  • Gathered preliminary data on Unix commands
    • Used my ubuntu VM sans emacs which I installed afterwards
    • 1828 commands installed by default, most of them length 8 (normal-ish distribution around it)
    • A lot of these seem like administrative commands
    • 315 non-privileged commands, most of them at length 7
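
A count like the one above can be reproduced with a short script. This is a sketch only: it assumes "installed commands" means executable files in a given list of directories, which may not be how the original numbers were gathered, and both function names are hypothetical.

```python
import os
from collections import Counter


def command_names(path_dirs):
    """Collect names of executable regular files found in the given directories."""
    names = set()
    for d in path_dirs:
        try:
            entries = os.listdir(d)
        except OSError:
            continue  # skip missing or unreadable directories
        for name in entries:
            full = os.path.join(d, name)
            if os.path.isfile(full) and os.access(full, os.X_OK):
                names.add(name)
    return names


def length_histogram(names):
    """Map command-name length -> number of commands with that length."""
    return Counter(len(n) for n in names)
```

For example, `length_histogram(command_names(os.environ["PATH"].split(os.pathsep))).most_common(1)` gives the most common command length on the current machine.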

2019-04-22

  • Began IRB paperwork
  • Began writing prompts for students

2019-04-23

  • Plan: Talk to Lupo/Pentoja about Fall Quarter
  • Spoke with Debbie Hart
    • Mention that I have no influence over grading status
    • Mention whether or not there is difference in ability to complete homework
    • Ensure students don't feel pressured to participate
    • Possibly two consent forms
    • Responses are not used, only information generated by the system
    • If any part of analysis uses plain responses, submit an IRB
  • Submitted IRB paperwork

2019-04-24

  • Wrote keylogger script for SSH
    • Concern: No way to not trace the password entered into SSH. Might have to instruct students on RSA keys.
  • Wrote up Context-Aware SSH docs

2019-04-27

  • Extensively tested key logger

2019-04-29

  • IRB approved
    • Submitted revised consent form
  • Talked to Tedd about getting VirtualBox on lab machines
  • Tested OpenVPN
    • Measured added latency (8.8.8.8 reference point): ~12.5ms
  • Wrote tcpdump filter for only traffic SSHing into another system
    • Considered: Restricting the filter to only the Unix server
    • I don't know if the IPs will change and it's still potentially valid data
    • Considered: Filtering for only PSH packets (all keystrokes are PSH)
    • I might need the extra information later so I'll hold on to it.
  • Tested pairing network packets to keystrokes
    • Difference between key and packet observation: ~11.8ms
    • First SYN packet is within 100ms of SSH starting in log

2019-04-30

  • Discussed thesis progress with DeBruhl
    • Password entry is questionable
    • Maybe write script that generates secureshell key if asked for password?
    • Might have to just filter it out
    • Decided port mapping might be unnecessary
  • Ethan (357 student) expressed interest in project
    • DeBruhl offered to do 400 in Fall
  • Set up CRON to automatically run packet tap on reboot + app armor permissions
  • Server suddenly cannot connect to Unix1
    • Forgot to save IPTables rule

2019-05-01

  • Started writing script to automatically pair keylog files to packet flows
    • Got to the point where it could match the start of a file to a TCP SYN

2019-05-02

  • Finished and tested script
    • Was able to match all packets within a 50ms time difference
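
The pairing step can be sketched as a single pass over two sorted timestamp lists. This is not the actual script, just an illustration under assumptions: timestamps are in seconds, both lists are sorted ascending, and the function name and `offset` parameter are hypothetical. The 50ms default tolerance comes from the result above.

```python
def match_keys_to_packets(key_times, packet_times, offset=0.0, tolerance=0.050):
    """Pair each keystroke with the first unused packet within `tolerance` seconds.

    Returns a list of (key_index, packet_index) pairs; unmatched keys are skipped.
    """
    matches = []
    p = 0
    for k, key_time in enumerate(key_times):
        expected = key_time + offset
        # Skip packets that are too early to belong to this keystroke.
        while p < len(packet_times) and packet_times[p] < expected - tolerance:
            p += 1
        if p < len(packet_times) and abs(packet_times[p] - expected) <= tolerance:
            matches.append((k, p))
            p += 1  # each packet is used at most once
    return matches
```

Keys with no packet inside the window are left out of the result, so the gap between `len(key_times)` and `len(matches)` gives a quick measure of match quality.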

2019-05-03

  • Tested multiple keylog files.
  • Discovered that different keylogs have different delays.
    • Delay within one log file fairly consistent
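
Since the delay is fairly consistent within one log but varies between logs, one option is to estimate it per file before matching. A rough sketch, assuming sorted timestamps in seconds and that each keystroke's packet is the first one at or after it; the function name is hypothetical and the median is my choice of estimator, picked to resist outliers.

```python
from bisect import bisect_left
from statistics import median


def estimate_delay(key_times, packet_times):
    """Median gap between each keystroke and the first packet at or after it."""
    deltas = []
    for key_time in key_times:
        i = bisect_left(packet_times, key_time)
        if i < len(packet_times):
            deltas.append(packet_times[i] - key_time)
    return median(deltas) if deltas else None
```

The result could then be used as a per-log offset when pairing keys to packets.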

2019-05-06

  • Created scripts which copy keylog file to router VM.

2019-05-07

  • Changes in delay attributable to AWS server instances changing
  • Tested two SSH sessions at same time
    • Found that it uses fd 5 instead of 4 - Not sure why
  • Found issue with VM routing
    • Apparently, upon reboot it adds tun0 without a VPN being turned on. Then tries to route all traffic through inactive device
    • --No idea why.--
    • Found old configuration. systemctl task was trying to open VPN separately.
  • Set up VPN to automatically enable when VM boots
  • Plan: Separate out each flow into a separate pcap

2019-05-08

  • Worked on separating each TCP flow into a separate pcap
    • Apparently editcap can't do this on its own so I'm writing my own utility for this
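
The core of such a utility is grouping packets by connection. A minimal sketch, assuming packets have already been parsed into dicts with `src_ip`, `src_port`, `dst_ip`, and `dst_port` keys (hypothetical field names); reading and writing the pcap files themselves is left out.

```python
from collections import defaultdict


def flow_key(src_ip, src_port, dst_ip, dst_port):
    """Direction-independent key so both halves of a connection land together."""
    a, b = (src_ip, src_port), (dst_ip, dst_port)
    return (a, b) if a <= b else (b, a)


def split_flows(packets):
    """Group packet dicts by flow, preserving capture order within each flow."""
    flows = defaultdict(list)
    for pkt in packets:
        key = flow_key(pkt["src_ip"], pkt["src_port"], pkt["dst_ip"], pkt["dst_port"])
        flows[key].append(pkt)
    return dict(flows)
```

One caveat: a reused source port would merge two connections under this key, so a real version would likely also start a new flow on each SYN.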

2019-05-09

  • Deployed VM to Unix machines upstairs
    • Apparently most people don't have enough space to house VMs
    • Might just have to make VMs smaller
    • Tested stripping down unneeded packages, didn't do much.
      • Might just have to use headless-ubuntu
  • Tested running Dataset 1 with Griffian
    • Issues with DNS resolution on VPN
    • Packets did not seem to capture --(may have just not flushed yet)--
    • Packets confirmed not captured.
    • Total time ~20 (gave longer than expected answers)
  • Set up Lubuntu VM (~1/2 the size)

2019-05-10

  • Fixed issues with VirtualBox version differences (CSL was running 6, I was running 5)
  • Gathered data from Lucy
  • Wrote script to automatically filter the packets before sending them to server.

2019-05-13

  • Got VMs working on CSL computers - Had to install to /tmp for space reasons
  • Tested client configuration with 3 people simultaneously
    • Sequentially each of us had the connection break
    • Each break came after ~15 minutes, and each connection was re-established before the next disconnect occurred
  • Added prompt to script to give students a chance to review/approve data before submitting

2019-05-14

  • Discussed progress with DeBruhl
  • Worked on the disconnect problem
    • --Seems like it happens after every 15 minutes almost exactly--
    • Can't seem to make it happen at all now.
  • Gathered data from a few students:
    • One had a disconnect but no noticeable issues
    • Another had a complete disconnect at the very end
    • Testing was completely finished and I was able to manually upload the data
    • Captured log of what happened
      • OpenVPN seems to be detecting itself as a replay attack after network goes down
      • Solution: Set up NTP server (virtual machine system clock is way off)
      • Solution: Use TCP

2019-05-15

  • Talked with Nico about testing O.S./S.S. sections
  • Found bug that caused network to disconnect
    • Network goes down -> link device removed -> route to VPN through device removed -> VPN traffic has no route
    • It takes 2 minutes for OpenVPN to detect the network is broken
      • Once OpenVPN tries to renegotiate, fixing the connection causes errors. Have to fully restart OpenVPN
    • Can fix within first two minutes by adding the route back

2019-05-16

  • Gathered more data from 357 students
    • Data velocity very slow
    • Option: Bigger test - Might dissuade more people
    • Option: Skip to dataset 2
    • Option: More classes
    • Option: Reduce dataset requirements

2019-05-17

  • Asked for volunteers from O.S.
  • Met with 357 students from Griffian's section/Elie

2019-05-18

  • Checked that pcaps, log files, and consent forms line up
    • Because of timezone offsets, some logs fall on a day boundary
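
One way to keep day boundaries from splitting matching files is to normalize every timestamp to UTC before comparing dates. A small sketch, assuming the log timestamps are naive local times with a known fixed offset; the function name is hypothetical, and the -7 used in the example is just Pacific Daylight Time.

```python
from datetime import datetime, timedelta, timezone


def to_utc(local_naive: datetime, utc_offset_hours: float) -> datetime:
    """Attach a fixed UTC offset to a naive local time and convert it to UTC."""
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    return local_naive.replace(tzinfo=local_tz).astimezone(timezone.utc)
```

For example, `to_utc(datetime(2019, 5, 17, 18, 30), -7)` falls on 2019-05-18 in UTC, which is the kind of mismatch seen between logs and pcaps.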

2019-07-20

  • Finished refactoring packet matcher to work with flow-separated pcaps.
  • Began looking over data
    • Seems like a lot of files have incomplete data - may have to rerecord all of it.

2019-07-27

  • Looked into issues with files
    • At least one file seems like it has all the correct data and timestamps, but poor matches
    • Seems packet-matcher has issues
    • Fixed. Seems like it was a problem with my keylog pattern matching. FD is not reliable.
    • Some files still fail
    • Time stamps are way off. Some data has timestamps way after my last recorded flow
      • Possibilities:
      • Flow separator did not split off a file for these flows (best case)
      • Unix Lab system clocks are wrong (unlikely but recoverable case)
      • Recording system did not capture the data (worst case)
      • Investigation:
      • See if I have a syn-packet for one of my problem flows
        • No Syn packet found. Nothing close. Looking at server to see if I have data at the source.
        • No packets recorded after 2019-05-21.
        • I have several flows from 2019-05-23 that were missed.
        • Other flows are from way earlier. May be a different problem.
        • Some packet captures empty. Seems data capture system needs serious work.

2019-08-11

  • Emailed Ethan
  • Emailed DeBruhl
  • Bought Amazon gift cards
  • Looked for patterns in missing data
    • None found
    • May be a failure in data collection or in data parsing
    • On the server nothing from day 23. 3 keylogs from that day.
    • Maybe it has to do with the connection dropping randomly?

2019-08-22

  • Collected data from work on personal server
    • Collection done in lab to best recreate the scenario under which students worked.

2019-09-08

  • Examined new data from 8-22
    • Data collected fine at first, then truncated
    • Truncation happened at midnight UTC (within 1 second), after exactly 2000 key strokes
      • Both those numbers are extremely suspect, and that they happen to coincide is unfortunate
      • Will investigate both these numbers
      • Maybe data rotation did not pick up the existing SSH connection (unlikely)
      • SSH may have undergone some kind of renegotiation (should be in logs if this is the case)
    • Data is otherwise usable (only 20% data loss)

2019-09-12

  • Examined script for rotating logs
    • Uses SIGKILL
    • Tested SIGKILL, doesn't flush buffer; SIGINT does
    • Script now uses SIGINT
  • Switched server to use PST so that logs rotate when no one is using it, just in case
    • Will test changes tomorrow
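
The SIGKILL vs SIGINT difference matters because SIGKILL cannot be caught, so the capture process never gets a chance to flush its buffer. The rotation logic could be sketched as below, assuming a POSIX system and a capture process started via `subprocess`; the function name and timeout are hypothetical, and this is not the actual rotation script.

```python
import signal
import subprocess


def stop_capture(proc: subprocess.Popen, timeout: float = 5.0) -> int:
    """Ask a capture process to exit cleanly so it can flush buffered data."""
    proc.send_signal(signal.SIGINT)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Last resort: SIGKILL cannot be handled, so unflushed data is lost.
        proc.kill()
        return proc.wait()
```

Falling back to SIGKILL only after a timeout keeps rotation from hanging while still giving the process a chance to write out its data.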

2019-09-23

  • Above bullet was a lie. Testing today.
  • Set up URL for raffle Captcha.
  • Submitted flier for IRB review.

2019-09-24

  • Worked out confusion with IRB about raffle
  • Discussed plans for the quarter with DeBruhl
  • Discussed work/plans for the project with Ethan
  • Examined files from 09-23
    • Files still truncated. Unknown reasons.

2019-09-25

  • Emailed professors asking permission to advertise in their class
  • Server hasn't been rebooted in 44 days.
    • Script was updated 2 weeks ago. New script might not be running.
    • Rebooted today.
    • Will test that new script is working tonight.

2019-09-26

  • Looked at matched file generated from yesterday
    • Not all the keys matched
    • All packets seem to be accounted for. Nothing visibly truncated
    • Possible events happening:
      • Some packets grouped together
      • Some packets not recorded for unexplained reasons
      • Some packets don't match the standard pattern
  • Updated script to tag guided/freeform work

2019-09-27

  • Handed new VM to Tedd for deployment

2019-09-30

  • Checked if new VMs deployed (they are not).
  • Upgraded packet matching script: succeeded on 9-25 data.
    • Also seems to work with prior, formerly thought to be corrupted, data