Statistics and machine learning expert with 4 years of experience finding creative solutions to complex problems,
specializing in NLP, time series analysis, and presentation.
A proven track record in improving efficiency, developing NLP products, and automating business processes.
A visionary leader with strong communication skills, passionate about leveraging AI to solve complex business problems and drive innovation.
Expertise includes developing and deploying advanced machine learning models, collaborating with cross-functional teams,
and providing technical leadership throughout project lifecycles.
MATHEMATICS: Linear Algebra, Bayesian & Frequentist Statistics, Probability, Modeling, Experimental Design
GENERATIVE AI: LangChain, LLMs, RAG, PEFT, LoRA, Prompt Engineering, Transformers, GANs
MACHINE LEARNING: TensorFlow, Time-Series, Natural Language Processing, Deep Learning, Feature Engineering, Supervised/Unsupervised Learning, Gradient Boosting, scikit-learn
DATA TOOLS: Snowflake, Python, SQL, Plotly, Streamlit, Dash, Pandas, NumPy
SOFT SKILLS: Project Planning, Requirements Gathering, Documentation, Written Communication, Verbal Communication, Multitasking, Quantitative Research, Qualitative Research, Public Speaking, Creative Problem-Solving
Data Scientist
- Reprogrammed algorithms for efficiency, decreasing runtime of two products by 20% and 80%
- Designed and implemented a new NLP product that leverages the Google Search API to map unstructured,
uncleaned text to corporate names with 93% accuracy
- Automated this process, raising the mapping rate from 3,000/year to 130,000/year
- Overhauled the unit testing procedure and wrote over 100 unit tests
- Created dozens of pages of new documentation and experiment tracking procedures
- Devised and coded an automated hyperparameter tuning system, saving 100+ man-hours per year
- Architected a churn model from very limited data that automated a manual process, saving over 70 man-hours per month
Data Science Fellow
- Worked with an experienced data science mentor on a project-by-project basis
- Reinforcement Learning: Built an AI investment management agent that generated positive returns
- NLP: Researched partisan bias in 15 publications, used clustering to determine which media are “mainstream”
- Business Impact, Project Planning: Evaluated and recommended potential upgrade projects;
estimated revenue increases from $33 million to $50 million annually
Graduate Teaching Assistant
- Developed lesson plans and presentations to teach complex mathematics to non-technical students
- Lectured to 4 classes of 30+ students, achieving a pass rate over 50% in a developmental course
Founder and Mentor
- Lead a team of new data scientists through 14 weeks of lessons and projects
- Create a new lesson and mini-project every week
- Explain highly technical mathematical concepts without using advanced math
- Conclude the program with a large team project
YouTube Creator - Data Science and Quantitative Research
- Research wide-ranging topics in data science, AI, and quantitative finance
- Create educational videos on all parts of the data science lifecycle and machine learning process
- DS Bootcamp Playlist
CNN-LSTM Forecasting and Portfolio Optimization
- Wrangled, compiled, cleaned, and explored datasets from multiple databases
- Selected stocks from the NYSE to create a low-correlation investment universe
- Devised an economic factor model of stock returns and created cutting-edge neural network models
to predict behavior of each stock
- Optimized stock portfolio to achieve maximum risk-adjusted reward
- Backtested the investment strategy, which outperformed benchmarks (annual Sharpe ratio: 0.90)
Ensemble Reinforcement Learning for Futures Trading
- Loaded, cleaned, and prepared price data on 78 futures contracts
- Coded training, validation, and trading environments for use with OpenAI Gym
- Implemented A2C, PPO, and DDPG reinforcement learning algorithms and combined them into a single
AI-driven ensemble trading strategy
JohnnAI Resume Assistant
- Uses the OpenAI API along with LangChain to power a RAG chatbot with chat history
- Can answer questions about my experience, education, strategy, and philosophy
- Deployed on Streamlit
- Embedded on this page or see it on Streamlit
Sentiment in Reporting: A data-driven analysis of bias in 15 major publications
- Performed sentiment analysis on over 140,000 news articles
- Used linear regression and Tukey HSD hypothesis testing to highlight significant differences in sentiment
- Used agglomerative hierarchical clustering, K-Means clustering, and DBSCAN to explore relationships
between publications and divide news sources into groups based on political sentiment
Graduate Student, Applied Mathematics
Master of Science, Mathematics
Master’s Thesis – Detecting Bubbles in the USD-JPY Exchange Rate by Sequential Monte Carlo Methods
- Implemented an SMC^2 nested particle filter to draw inference on bubble likelihood and parameter values simultaneously
- Devised, simulated, and tested a financial management strategy based on the results
- Programmed this complex econometric model in R