Extraordinary achievements are born from the accumulation of ordinary efforts.
I graduated from Seoul National University and currently work at LinqAlpha.
I enjoy both research and engineering, and I am most energized by asking important, unsolved questions and working through them to clear, grounded answers.
That process helps me focus deeply and think with clarity.
Research Interest
My vision is to help build multi-modal agents that can control computers and automate labor, so people can focus on what truly matters.
Just as AlphaGo reshaped public expectations by decisively defeating world champions in Go, I believe we will eventually see AI agents that, under the same constraints as humans, can outperform top players in complex online games.
I want to contribute to making that future possible in a way that is not only powerful, but also faithful, reliable, and efficient.
To make that future real, capability alone is not enough: models must remain faithful to evidence while reasoning efficiently, and they must plan, search, and use tools at a level that can exceed human experts across diverse tasks.
Recent advances in AI, particularly in multi-modal understanding and tool use, make this future feel increasingly within reach.
However, current models still fail in ways that matter in practice: they can drift from the given evidence, produce long but ungrounded reasoning traces, and invoke the wrong tools at critical decision points.
I want to contribute to closing that gap, and my current research interests focus on multi-modal faithfulness, efficient reasoning, planning, and test-time scaling.
Publications
Distributional Alignment as a Principle for Designing Task Vectors in In-Context Learning
In Submission (Preprint)
Jihoon Kwon, Jiwon Choi, Jy-yong Sohn
In this paper, we study task vectors for in-context learning and introduce an evaluation metric and an extraction method grounded in the principle of distributional alignment with ICL.
Enhancing Compositional Reasoning in CLIP via Reconstruction and Alignment of Text Descriptions
NeurIPS 2025 (Poster)
Jihoon Kwon, Kyle Min, Jy-yong Sohn
In this paper, we propose READ-CLIP, a fine-tuning method that improves compositional reasoning in vision-language models via reconstruction and alignment losses.
Education
Seoul National University
Double Major: Business Administration
Relevant Coursework: Machine Learning, Optimization, Statistics
Work Experience
Fundamental Research Engineer - AI/LLM
LinqAlpha
2023/09 - Present
Developing a GUI-based Vision-Language-Action agent that automatically joins, records, and transcribes earnings and conference calls.
Owning research and critical experiments for finance-focused LLM applications.
Building an end-to-end benchmark pipeline for training and evaluating finance-specific LLM systems.
Research Intern
ITML Lab, Yonsei University
2024/07 - Present
Proposed READ-CLIP, a fine-tuning method for compositional reasoning in vision-language models (NeurIPS 2025).
Proposed dNTP, a principled metric for evaluating in-context learning task vectors, and LTV, a training-free method for higher-performing, low-latency test-time inference.
Researching training methodologies that promote monosemantic feature learning when interpreting vision-language models with sparse autoencoders.
Researching efficient Monte Carlo Tree Search-based test-time scaling for autoregressive LLMs using diffusion-LLM hybrid approaches.
Projects
SQA Alphathon 2025 Winner: Tracking Evolving Signals in Corporate Disclosures
October 2025
Developed an end-to-end LLM system for stock return prediction by detecting strategic metric shifts in earnings calls.
Won SQA Alphathon 2025 with 3.6x better forecasting performance than the baseline method.
Designed context-aware extraction and semantic scoring to capture evolving corporate narratives.
Led full project execution from problem definition to final validation.