Deep learning in practice: ViT image classification and DQN multi-agent

Top 10% in a Kaggle competition (validation accuracy ≈ 98%), plus a hands-on multi-agent reinforcement learning project.

PyTorch · Transformers · ViT/DeiT · DQN · Reinforcement learning

Problem

Spend long enough at the application layer and people start to ask whether you understand the fundamentals. These two projects are the model-side proof: one a supervised-learning competition, the other multi-agent reinforcement learning.

Approach

ViT/DeiT fine-tuning: layer-wise learning rates, Label Smoothing, and a RandAugment/Mixup/CutMix augmentation combo. DQN multi-agent: 4 agents trained to cooperate on round-trip transport in a 5×5 grid, using a shared-network DQN plus a yield-priority mechanism.

Results

Top 10%

Kaggle ranking

≈ 98%

ViT validation accuracy

95% (average steps -20%)

Multi-agent transport success rate

AI's role in this project

These projects are the proof of model-side fundamentals: the fine-tuning strategy, data augmentation, reward design, and multi-agent coordination were all tuned by hand.

ViT image classification (Kaggle Top 10%)

ViT/DeiT fine-tuning. The key techniques were layer-wise learning rates (small steps for the lower layers, larger steps for the top), Label Smoothing, and a RandAugment/Mixup/CutMix augmentation combo. Validation accuracy reached around 98%, and the competition ranking was Top 10%.
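To make two of these techniques concrete, here is a minimal sketch in plain Python. It is not the project's actual training code: the function names, the 12-layer depth, the decay factor of 0.75, the top learning rate of 1e-3, and the smoothing value of 0.1 are all illustrative assumptions. The first function assigns smaller learning rates to lower layers by geometric decay, and the second builds a label-smoothed target distribution.

```python
def layerwise_lrs(num_layers: int, top_lr: float, decay: float) -> list[float]:
    """Return one learning rate per layer; index 0 is the lowest layer.

    Hypothetical helper: the top layer gets `top_lr`, and each layer
    below it is scaled down by another factor of `decay`.
    """
    return [top_lr * decay ** (num_layers - 1 - i) for i in range(num_layers)]


def smooth_labels(target: int, num_classes: int, eps: float) -> list[float]:
    """Mix a one-hot target with a uniform distribution (label smoothing)."""
    off = eps / num_classes  # probability mass given to every class
    return [(1.0 - eps) + off if k == target else off for k in range(num_classes)]


# Assumed values for illustration only.
lrs = layerwise_lrs(num_layers=12, top_lr=1e-3, decay=0.75)
targets = smooth_labels(target=2, num_classes=4, eps=0.1)
```

In a PyTorch setup, a list like `lrs` would typically be turned into one optimizer parameter group per transformer block, each with its own `lr`, and label smoothing can be applied through the `label_smoothing` argument of `torch.nn.CrossEntropyLoss`.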

DQN multi-agent transport

4 agents do round-trip transport in a 5×5 grid. A shared-network DQN lowers training cost, and a yield-priority mechanism resolves path conflicts, giving a 95% success rate with average steps cut by 20%. This "coordinate and yield" design addresses the same class of problem as conflict arbitration in today's multi-agent systems.
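One way a yield-priority rule could work is sketched below. This is an assumption-laden illustration, not the project's implementation: the function name `resolve_moves`, the fixed priority order, and the rule that a blocked agent simply waits in place are all hypothetical. Agents are processed from highest to lowest priority, and an agent moves only if its proposed cell is inside the grid and not currently held by another agent.

```python
GRID = 5  # 5x5 grid, as in the project description


def resolve_moves(positions, proposals, priority):
    """Apply proposed moves in priority order; blocked agents yield (stay put).

    positions: {agent_id: (row, col)} current cells, all distinct
    proposals: {agent_id: (row, col)} cell each agent wants to enter
    priority:  agent ids, highest priority first (hypothetical ordering)
    """
    final = dict(positions)  # start with every agent staying where it is
    for agent in priority:
        target = proposals.get(agent, positions[agent])
        in_grid = 0 <= target[0] < GRID and 0 <= target[1] < GRID
        occupied = any(
            cell == target for other, cell in final.items() if other != agent
        )
        if in_grid and not occupied:
            final[agent] = target  # move succeeds
        # otherwise the agent yields and keeps its current cell
    return final


# Agents 0 and 1 both want (0, 1); agent 2 proposes a cell outside the grid.
positions = {0: (0, 0), 1: (0, 2), 2: (4, 4), 3: (2, 2)}
proposals = {0: (0, 1), 1: (0, 1), 2: (4, 5), 3: (2, 3)}
result = resolve_moves(positions, proposals, priority=[0, 1, 2, 3])
```

This rule is deliberately conservative: it never lets two agents share a cell or swap places, at the cost of an occasional wasted step for the agent that yields. In a DQN setting, the proposals would come from each agent's greedy or epsilon-greedy action under the shared Q-network.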