Behind the Scenes: The Prompts and Tricks That Made Many-Shot ICL Work

2 Jun 2025

The appendix details prompts, demonstration-selection robustness tests, GPT4V-Turbo comparisons, and medical QA extensions that validate the many-shot ICL methodology.

Scientists Just Found a Way to Skip AI Training Entirely. Here's How

2 Jun 2025

Many-shot ICL enables quick model adaptation without fine-tuning, improving accessibility. Future work: other tasks, open models, bias reduction.

How Many Examples Does AI Really Need? New Research Reveals Surprising Scaling Laws

2 Jun 2025

Gemini 1.5 Pro shows log-linear gains up to ~1K examples (+38% accuracy). Batching reduces costs 45x and latency 35x with minimal performance loss.

The Science Behind Many-Shot Learning: Testing AI Across 10 Different Vision Domains

2 Jun 2025

Evaluates GPT-4o vs Gemini 1.5 Pro on 10 vision datasets with many-shot ICL, using stratified sampling and standard accuracy/F1 metrics.

Why Thousands of Examples Beat Dozens Every Time

2 Jun 2025

Many-shot multimodal ICL with thousands of examples improves LMM performance. Gemini 1.5 Pro shows log-linear gains; batching reduces costs.

How CODEX Model Size Influences COCOGEN's Output Quality

24 Apr 2025

Explores the impact of model size on COCOGEN's performance: CODEX-002 outperforms CODEX-001, and prompt sensitivity decreases as model size increases.

COCOGEN vs DAVINCI: A Human Evaluation of Structured Commonsense Graph Generation

24 Apr 2025

Human evaluation shows COCOGEN outperforms DAVINCI, generating more relevant and correct commonsense graphs on tasks such as EXPLAGRAPHS and PROSCRIPT.

Using Code-LLMs for Structured Commonsense Reasoning

24 Apr 2025

COCOGEN pioneers the use of Code-LLMs for structured commonsense generation, opening new avenues for NLP tasks that require structured prediction and reasoning.

Unlocking Structured Commonsense Reasoning with Code-LLMs

23 Apr 2025

COCOGEN pioneers the use of Code-LLMs for structured commonsense generation, opening new avenues for NLP tasks that require structured prediction and reasoning.