Tojo was born and raised in Makakilo on the island of Oahu. He graduated from Kamehameha Schools Kapalama and is currently studying computer science at Columbia University. In his free time, he enjoys programming algorithms while surfing and eating ocean-salted broccoli. He has always loved the sciences, especially computer science, and can’t wait to expand his knowledge of the field. He hopes to one day pursue a career in STEM to make a positive impact on society while satisfying his passion for learning.
Home Island: Oahu
High School: Kamehameha Schools Kapalama
Institution when accepted: Columbia University
Akamai Project: Defining Data Workflow for PEcoC’s Deep Learning Model
Project Site: Maui High Performance Computing Center – Kihei, Maui
Mentors: Jeremy Young
Project Abstract:
The Pacific Ecosystem for Cyber (PEcoC) is an AI/ML-enabled architecture that detects anomalous network traffic to advance full-spectrum cyberspace operations. Currently, PEcoC collects network packet capture (PCAP) data, processes the packets into session-level logs with Zeek (a network monitoring tool), and scores sessions using shallow unsupervised anomaly detection models. Additionally, a deep learning algorithm, an autoencoder for DNS traffic, has been developed to run on raw PCAP. However, it is not operational because it scores individual packets rather than sessions, producing too much data for analysts to process. Our objective was to create a refined autoencoder workflow that aggregates packet scores into session scores so that the results are usable by analysts. We developed a custom Zeek script that logs the indexes of each session's packets in the default DNS log. We then used Python scripts to match the autoencoder’s scored packets with their corresponding sessions. Finally, we investigated several ways of aggregating packet scores into the autoencoder’s session scores, such as the average or maximum packet score per session. Once deployed, this refined autoencoder workflow will put digestible results into the hands of analysts, who can then begin giving feedback that helps us optimize and tailor the deep learning algorithm.
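The matching and aggregation steps described in the abstract can be illustrated with a minimal Python sketch. This is not the project's actual code: the function names, the session identifiers, the dictionary layout, and the sample scores are all hypothetical, and the sketch assumes the custom Zeek log has already been parsed into a mapping from each session to its packet indexes.

```python
from collections import defaultdict
from statistics import mean


def map_packets_to_sessions(session_packets):
    """Invert {session_id: [packet_index, ...]} into {packet_index: session_id}."""
    packet_to_session = {}
    for session_id, indexes in session_packets.items():
        for index in indexes:
            packet_to_session[index] = session_id
    return packet_to_session


def aggregate_session_scores(packet_scores, session_packets):
    """Group per-packet scores by session and return the mean and max per session."""
    packet_to_session = map_packets_to_sessions(session_packets)
    grouped = defaultdict(list)
    for index, score in packet_scores.items():
        session_id = packet_to_session.get(index)
        if session_id is not None:  # skip packets that match no logged session
            grouped[session_id].append(score)
    return {
        session_id: {"mean": mean(scores), "max": max(scores)}
        for session_id, scores in grouped.items()
    }


# Hypothetical inputs: packet indexes per session (as a parsed log might provide)
# and one anomaly score per packet (as an autoencoder might produce).
session_packets = {"session_a": [0, 1, 2], "session_b": [3, 4]}
packet_scores = {0: 0.10, 1: 0.30, 2: 0.20, 3: 0.90, 4: 0.50, 5: 0.70}

session_scores = aggregate_session_scores(packet_scores, session_packets)
for session_id, stats in session_scores.items():
    print(session_id, stats)
```

The two aggregations behave differently: a mean smooths over a single unusual packet in a long session, while a maximum flags the session if any one packet looks anomalous. Which is preferable depends on analyst feedback, as the abstract notes.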